Preface


On the occasion of the Golden Jubilee Celebration (fifty years of India’s Independence),
the Indian National Science Academy decided to publish a number of monographs on
selected fields. The editorial board of INSA invited us to prepare a special monograph
on Number Theory. In response to this assignment, we invited several eminent number
theorists to contribute expository or research articles. Although some of those invited
could not accept because of other commitments, we received an encouraging response
from many eminent and creative number theorists throughout the world. Their
articles are presented herewith in a logical order.


We are grateful to all those mathematicians who have sent us their articles. 
We hope that this monograph will have a significant impact on further
development in this subject.


R. P. Bambah 
V.C. Dumir 
R. J. Hans-Gill 
A Centennial History of the Prime Number Theorem
Tom M. Apostol 


The Prime Number Theorem 


Among the thousands of discoveries made by mathematicians over the centuries, some stand 
out as significant landmarks. One of these is the prime number theorem, which describes 
the asymptotic distribution of prime numbers. It can be stated in various equivalent forms, 
two of which are: 


$$\pi(x) \sim \frac{x}{\log x} \quad \text{as } x \to \infty, \tag{1}$$

and

$$p_n \sim n \log n \quad \text{as } n \to \infty. \tag{2}$$


In (1), $\pi(x)$ denotes the number of primes $p \le x$ for any $x > 0$. In (2), $p_n$ is the $n$th prime
number when they are listed in increasing order; thus, $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, ....

Interest in primes goes back to Euclid’s Elements (Book IX, Proposition 20) where it is
shown that there are infinitely many primes. Euclid’s theorem implies that $\pi(x) \to \infty$ as
$x \to \infty$, and that $p_n \to \infty$ as $n \to \infty$.

The behaviour of $\pi(x)$ as a function of $x$, and of $p_n$ as a function of $n$, has been the
object of intense study by many celebrated mathematicians. Around 1791 Gauss (at age 14)
conjectured that $\pi(x)$ is approximately equal to the integral $\int_2^x dt/\log t$. This is now called
the logarithmic integral and is denoted by $\mathrm{Li}(x)$. Except for numerical evidence, Gauss
provided no argument to support his conjecture.

The first proof of the prime number theorem came a century later, in 1896, the culmination
of efforts by many talented individuals. This paper tells the story of this proof, which has
several heroes. First, there is Euclid himself, who started it all with his theorem on the
infinitude of the primes. Then we have Euler, Gauss, Legendre, Dirichlet, Chebyshev, 
Riemann, von Mangoldt, Hadamard, and de la Vallée Poussin. As the story unfolds, it 
reveals how each contributed a significant stepping stone to ultimate success. 


Euler’s Contribution 


In 1737, Euler was the first to introduce methods of analysis to study primes when he proved 
Euclid’s theorem on the infinitude of primes by showing that the infinite series 


$$\sum_p \frac{1}{p}$$

(extended over all primes $p$) is divergent. Every calculus student learns that the harmonic
series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges. If you discard all the composite numbers in the harmonic series
there are still enough terms left over to make the series diverge. Euler’s proof was based
on the formal identity

$$\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \frac{1}{1 - p^{-s}}, \tag{3}$$


which he established for real s > 1. The infinite product on the right is extended over all 
primes p. It’s not hard to see where this identity comes from. The key is our old friend the 
geometric series: 
$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots,$$

which is also familiar to every student of calculus. The series converges whenever $|x| < 1$.
Take $x = p^{-s}$ and expand each factor in the infinite product as a geometric series to get

$$\frac{1}{1 - p^{-s}} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \frac{1}{p^{3s}} + \cdots.$$


Note that each denominator on the right is a prime power $p^m$ raised to the power $s$. When
you multiply all these series together and rearrange the terms according to increasing
denominators you end up with the series $\sum_{n=1}^{\infty} \frac{1}{n^s}$, because of the fundamental theorem
of arithmetic, which states that every integer $n$ greater than 1 can be factored in one and
only one way as a product of prime powers, apart from the order of the factors. Because
we are multiplying an infinite number of infinite series, there are some delicate questions 
of convergence involved. Euler ignored these questions but the steps can be justified. 

Euler’s infinite product identity can be regarded as the analytic equivalent of the funda- 
mental theorem of arithmetic, and it forms the basis for nearly all subsequent work on the 
distribution of primes. 
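
Euler’s identity can be checked numerically. The following is a minimal sketch, not part of the original argument: it compares a partial sum of the series with a partial Euler product at $s = 2$, where the common value is known to be $\pi^2/6$, and also shows how slowly the sum of prime reciprocals grows. The helper names (`primes_up_to`, `zeta_partial_sum`, `euler_partial_product`) and the cutoff of 100,000 are illustrative assumptions.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Cross out every multiple of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def zeta_partial_sum(s, terms):
    """Partial sum of the series on the left of identity (3)."""
    return sum(1.0 / n**s for n in range(1, terms + 1))


def euler_partial_product(s, prime_limit):
    """Partial product on the right of identity (3), over primes <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


s = 2.0
series_value = zeta_partial_sum(s, 100_000)
product_value = euler_partial_product(s, 100_000)
exact = math.pi**2 / 6  # known closed form of the series at s = 2

# Sum of 1/p over primes up to the same cutoff: it diverges, but very slowly.
reciprocal_sum = sum(1.0 / p for p in primes_up_to(100_000))

print(series_value, product_value, exact)
print(reciprocal_sum)
```

Both truncations approach the same limit from below, which is what the identity predicts for $s > 1$.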


The Conjectures of Gauss and Legendre 


It seems that Gauss and Legendre were the first to consider the distribution of primes. In
1791, at age 14, Gauss examined a table of primes compiled by Lambert. Gauss counted the 
primes in blocks of 100, 1000, and 10,000 consecutive integers and made a note in his diary 
that the function 1/log x was a good approximation to the average density of distribution 
(number of primes per unit interval). A few years later in 1796, when Vega published an 
extended table of primes up to 400,031, Gauss substantiated his hypothesis further, and he 
kept returning to this work as new tables of primes appeared. Many years later, in 1849, he 
communicated his observations in a letter to the astronomer Encke, and the results were
published posthumously in 1862. (Gauss died in 1855.) Based on tables listing primes up
to 3 million, Gauss observed that $\pi(x)$ is closely approximated by the integral of the density
function.

Here are some excerpts adapted from his letter to Encke. This table shows $\pi(x)$ and
$\mathrm{Li}(x)$ for various $x$ between 500,000 and 3 million.

$x$          $\pi(x)$    $\mathrm{Li}(x)$    % error

500,000      41,556      41,604.4            [illegible]
1,000,000    78,501      78,627.5            0.16
1,500,000    114,112     114,263.1           0.13
2,000,000    148,883     149,054.8           0.11
2,500,000    183,016     183,245.0           0.12
3,000,000    216,745     216,970.6           0.10

The agreement between $\pi(x)$ and $\mathrm{Li}(x)$ is certainly striking. The error in each of these
approximations is only about one-tenth of one percent.
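
The comparison can be repeated on a modern machine. The sketch below is an illustration only: it counts primes with a sieve and approximates the logarithmic integral by Simpson’s rule, assuming the lower limit 2 for the integral. The function names and the step count are assumptions, and the output need not match the table exactly, since the table reflects Gauss’s hand counts and possibly a different lower limit of integration.

```python
import math


def prime_count(x):
    """Count the primes <= x with the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(sieve)


def log_integral(x, steps=1_000_000):
    """Approximate the integral of dt/log t from 2 to x (composite Simpson rule)."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (x - 2) / steps
    total = 1.0 / math.log(2) + 1.0 / math.log(x)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / math.log(2 + i * h)
    return total * h / 3.0


x = 10**6
pi_x = prime_count(x)
li_x = log_integral(x)
simple = x / math.log(x)  # the cruder approximation x / log x

print(pi_x, round(li_x, 1), round(simple, 1))
print("relative error of Li(x):", (li_x - pi_x) / pi_x)
```

At this value of $x$ the integral is a noticeably better approximation to the prime count than $x/\log x$.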

Legendre also considered this question in the 2nd edition of his number theory text,
published in 1808. A page in this book lists approximations for $\pi(x)$ for various $x$ up to a
million. Legendre asserts that $\pi(x)$ is closely approximated by the quotient

$$\frac{x}{\log x - 1.08366}.$$

On a later page he states that

$$\pi(x) = \frac{x}{\log x - A(x)},$$

where $A(x)$ is a function of $x$ that approaches 1.08366 as $x \to \infty$.

Neither Gauss nor Legendre revealed how he arrived at the appearance of the natural
logarithm in his formulas. Nor did they make any explicit statement about how good
these approximations were thought to be outside the range of the tables. It is generally
understood that both intended to imply that the ratio of $\pi(x)$ to each approximating formula
on the right tends to the limit 1 as $x$ tends to infinity.

Integration by parts shows that the value of Gauss’s integral is approximately $x/\log x$,
so both Gauss’s and Legendre’s relations are equivalent to the simpler asymptotic relation

$$\pi(x) \sim \frac{x}{\log x} \quad \text{as } x \to \infty,$$

which means

$$\lim_{x \to \infty} \frac{\pi(x) \log x}{x} = 1.$$

This result is now known as the prime number theorem, one of the most astonishing 
results in all of mathematics. It describes a simple relation between primes and the natural 
logarithm function which, at first glance, has nothing to do with prime numbers. 


How to Conjecture the Prime Number Theorem 
by Examining a Table of Primes


It is natural to ask why Gauss and Legendre used the natural logarithm in their formulas. 
They did not leave any written clues; they simply recorded their formulas and the supporting 
data. Let’s see how one might be led to conjecture the prime number theorem by examining
a table of primes. Here are some entries from a table of values of $\pi(x)$:


$x$        $10^2$  $10^4$  $10^6$  $10^8$     $10^{10}$    $10^{12}$       $10^{14}$
$\pi(x)$   25      1,229   78,498  5,761,455  455,052,512  37,607,912,018  3,204,941,750,802


We have listed $\pi(x)$ for successive powers of 10 increasing by a factor of $10^2$. Gauss
had access to tables that only went up to 3 million. We have added the last four entries from
more modern tables. What can we learn by looking at these numbers? Since we want to
find how fast $\pi(x)$ grows with $x$, it is natural to look at the ratio $x/\pi(x)$ that compares the
two quantities. This table shows the corresponding ratios:


$x$         $10^2$  $10^4$  $10^6$  $10^8$     $10^{10}$    $10^{12}$       $10^{14}$
$\pi(x)$    25      1,229   78,498  5,761,455  455,052,512  37,607,912,018  3,204,941,750,802
$x/\pi(x)$  4.000   8.137   12.739  17.357     21.975       26.590          31.202


Notice the gaps between successive entries in this last row of numbers: 


gaps:   4.137   4.602   4.618   4.618   4.615   4.612


As each exponent of 10 increases by 2, the ratio $x/\pi(x)$ increases by an almost constant amount,
4.6, which is 2.3 times the change in the exponent of 10. But if $x$ is expressed as a power
of 10, then the exponent of $x$ is the logarithm of $x$ to the base 10. So the table indicates
that the change in the ratio $x/\pi(x)$ is approximately 2.3 times the change in $\log_{10} x$. What
about this strange factor 2.3? A bright fourteen-year-old such as Gauss would immediately
realize that 2.3 is very nearly the logarithm of 10 to the base $e$, so

$$2.3 \log_{10} x \approx (\log_e 10)(\log_{10} x) = \log_e x.$$


This suggests that we compare the ratio $x/\pi(x)$ with $\log x$ (the natural logarithm of $x$).
Our table now looks like this:


$x$                        $10^2$  $10^4$  $10^6$  $10^8$     $10^{10}$    $10^{12}$       $10^{14}$
$\pi(x)$                   25      1,229   78,498  5,761,455  455,052,512  37,607,912,018  3,204,941,750,802
$x/\pi(x)$                 4.000   8.137   12.739  17.357     21.975       26.590          31.202
$\log x$                   4.605   9.210   13.816  18.421     23.026       27.631          32.236
$\log x \big/ (x/\pi(x))$  1.151   1.132   1.085   1.061      1.048        1.039           1.033


Anyone who looks at this last row of numbers would surely be tempted to conjecture that
they approach 1 as $x \to \infty$. Gauss, Legendre, and many other eminent mathematicians of
the early 19th century apparently thought so, but they were not able to prove it. As far as 
we know, neither Gauss nor Legendre made any significant progress toward a proof. 
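
The first few columns of these tables are easy to regenerate. The sketch below is illustrative only: it sieves up to $10^6$ (the larger entries need more refined counting methods) and prints $\pi(x)$, $x/\pi(x)$, $\log x$, and their quotient. The function name `prime_counts` and the choice of checkpoints are assumptions.

```python
import math


def prime_counts(limit, checkpoints):
    """Return {x: number of primes <= x} for each x in checkpoints (x <= limit)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    wanted = set(checkpoints)
    counts = {}
    running = 0
    for n in range(limit + 1):
        running += sieve[n]  # sieve[n] is 1 exactly when n is prime
        if n in wanted:
            counts[n] = running
    return counts


checkpoints = [10**2, 10**4, 10**6]
counts = prime_counts(10**6, checkpoints)

rows = []
for x in checkpoints:
    ratio = x / counts[x]              # x / pi(x)
    quotient = math.log(x) / ratio     # log x / (x / pi(x))
    rows.append((x, counts[x], ratio, math.log(x), quotient))

for row in rows:
    print("%9d %7d %8.3f %8.3f %6.3f" % row)
```

The last column decreases toward 1 as $x$ grows, in line with the conjecture described above.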
Legendre’s Conjecture on Primes in Arithmetical Progressions:
Dirichlet’s Contribution 


In the 1808 edition of his number theory book, Legendre formulated another conjecture 
on prime numbers that plays a role in this story. It has to do with primes in arithmetic 
progressions. An arithmetic progression of integers with first term $h$ and common difference
$k$ consists of all numbers of the form $kn + h$ as $n$ runs through all the nonnegative integers
0, 1, 2, .... If $h$ and $k$ have a common prime factor $p$ then each term of the progression is
divisible by p and there can be no more than one prime in the progression. Legendre asked 
whether there must be infinitely many primes in the progression if h and k have no common 
prime factor. 

In a celebrated memoir published in 1837, Dirichlet showed that every arithmetic
progression $kn + h$, where $h$ and $k$ have no prime factor in common, must contain infinitely
many primes, a result now known as Dirichlet’s theorem on primes in arithmetical
progressions. Inspired by Euler’s proof of the infinitude of primes, Dirichlet used an ingenious
argument to show that the sum of the reciprocals of all the primes in the progression $kn + h$
diverges. He did this by establishing the following asymptotic formula:

$$\sum_{\substack{p \le x \\ p \equiv h \pmod{k}}} \frac{1}{p} \sim \frac{1}{\varphi(k)} \log(\log x) \quad \text{as } x \to \infty.$$


This states that the partial sums of the series containing the reciprocals of all primes $p \le x$ in
the progression have the asymptotic value $\frac{1}{\varphi(k)} \log(\log x)$, where $\varphi(k)$ is the Euler $\varphi$ function
(the number of integers from 1 to $k$ that have no prime factor in common with $k$). Since
$\log(\log x) \to \infty$ as $x \to \infty$ this formula implies that the infinite series extended over all
primes in the progression diverges, so the progression must contain infinitely many primes.

Dirichlet’s proof of the infinitude of primes in arithmetical progressions was a landmark 
achievement. It marked the birth of analytic number theory, a new branch of mathematics 
in which problems concerning the integers are attacked by methods of analysis. This paper 
undoubtedly had an effect on subsequent work on the prime number theorem. The ideas 
introduced in Dirichlet’s proof laid the basis for areas of mathematical research that have 
had profound applications to both analytic and algebraic number theory. 


Chebyshev’s Contributions 


After Dirichlet’s work on primes in arithmetical progressions in 1837, the first real step 
towards a proof of the prime number theorem itself was made in 1848 by the Russian 
mathematician P. L. Chebyshev. He proved that:

If the ratio $\pi(x) \big/ \frac{x}{\log x}$ has a limit as $x \to \infty$, then this limit must equal 1.

But Chebyshev was unable to prove that this ratio actually tends to a limit. In 1850, he
proved that $\pi(x)$ lies between two numerical constants times the ratio $x/\log x$, so $x/\log x$
does, indeed, represent the true order of magnitude of $\pi(x)$. The problem is difficult because
there is no useful formula that generates the primes, and there is no obvious way to obtain 
information about their distribution. 
Chebyshev introduced two new functions $\vartheta(x)$ and $\psi(x)$ that are somewhat more
convenient to deal with than $\pi(x)$. They are defined as follows:

$$\vartheta(x) = \sum_{p \le x} \log p$$

and

$$\psi(x) = \sum_{m=1}^{\infty} \sum_{p^m \le x} \log p = \vartheta(x) + \vartheta(x^{1/2}) + \vartheta(x^{1/3}) + \cdots.$$

In the sum over $m$ the term with $m = 1$ is just $\vartheta(x)$. The term with $m = 2$ is the sum
over all primes with $p^2 \le x$, or $p \le x^{1/2}$, so this is $\vartheta(x^{1/2})$. The term with $m = 3$ is
$\vartheta(x^{1/3})$, and so on. The sum on $m$ is a finite sum because it terminates when $p^m > x$, or
$m > (\log x)/(\log p)$. The second term is

$$\vartheta(x^{1/2}) = \sum_{p \le \sqrt{x}} \log p \le \sqrt{x} \log \sqrt{x} = \frac{1}{2} \sqrt{x} \log x.$$


The remaining terms are smaller, and there are at most $(\log x)/\log 2$ terms altogether, so
$\psi(x)$ and $\vartheta(x)$ are related as follows:

$$0 \le \psi(x) - \vartheta(x) \le \frac{\sqrt{x} \log^2 x}{2 \log 2}.$$


Divide this inequality by $x$ and let $x \to \infty$ to obtain the limit relation

$$\lim_{x \to \infty} \left( \frac{\psi(x)}{x} - \frac{\vartheta(x)}{x} \right) = 0.$$

This shows that both ratios $\psi(x)/x$ and $\vartheta(x)/x$ exhibit the same asymptotic behaviour for
large $x$.
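
The two Chebyshev functions and the inequality relating them can be computed directly from their definitions. The following is a rough sketch under stated assumptions: the function names `theta` and `psi` and the sample value $x = 10^6$ are chosen here for illustration, and floating-point sums of logarithms stand in for the exact quantities.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def theta(x, primes):
    """Sum of log p over primes p <= x."""
    return sum(math.log(p) for p in primes if p <= x)


def psi(x, primes):
    """Sum of log p over all prime powers p^m <= x (m = 1, 2, 3, ...)."""
    total = 0.0
    for p in primes:
        if p > x:
            break
        power = p
        while power <= x:
            total += math.log(p)
            power *= p
    return total


x = 10**6
primes = primes_up_to(x)
theta_x = theta(x, primes)
psi_x = psi(x, primes)
# Upper bound for psi(x) - theta(x) from the inequality in the text.
bound = math.sqrt(x) * math.log(x) ** 2 / (2 * math.log(2))

print(theta_x / x, psi_x / x)
print(psi_x - theta_x, "<=", bound)
```

Both ratios are already close to 1 at $x = 10^6$, and the gap between the two functions is far below the stated bound.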


The Connection with $\pi(x)$


As might be expected, the Chebyshev functions are related to $\pi(x)$. For example, since
$\pi(x)$ is a step function with a jump of 1 at each prime, we can use Stieltjes integrals to write

$$\vartheta(x) = \sum_{p \le x} \log p = \int_1^x \log t \, d\pi(t).$$

Integration by parts gives us

$$\vartheta(x) = \pi(x) \log x - \int_1^x \frac{\pi(t)}{t} \, dt,$$

from which we find

$$\frac{\vartheta(x)}{x} = \frac{\pi(x) \log x}{x} - \frac{1}{x} \int_1^x \frac{\pi(t)}{t} \, dt.$$

From this and the relation between $\vartheta(x)$ and $\psi(x)$ it is easy to show that the following three
statements are equivalent formulations of the prime number theorem:

$$\lim_{x \to \infty} \frac{\pi(x) \log x}{x} = 1,$$

$$\lim_{x \to \infty} \frac{\vartheta(x)}{x} = 1,$$

$$\lim_{x \to \infty} \frac{\psi(x)}{x} = 1.$$


Chebyshev proved that if any one of these three limits exists and equals 1, then the others
also exist and equal 1. But he was unable to prove that any of these limits exists.

Incidentally, in 1881 Sylvester wrote a paper in the American Journal of Mathematics
describing Chebyshev’s methods. He praised Chebyshev’s contributions and concluded:


“We shall probably have to wait until someone is born into the world as far surpassing 
Chebyshev in insight and penetration as Chebyshev has proved himself superior in these 
qualities to the ordinary run of mankind.”


Well, when Sylvester wrote these words, three people had already been born with these 
qualities: Riemann, born in 1826, Hadamard, born in 1865, and de la Vallée Poussin, born 
in 1866, the year Riemann died. 


Riemann’s Contributions 


The next significant step toward the proof of the prime number theorem was made in 1859 
by G. F. B. Riemann in a famous 8-page paper, the only one he wrote on prime numbers.

Riemann attacked the problem with a new method based on Euler’s infinite product 
identity (3). He replaced the real variable $s$ by a complex variable, $s = \sigma + it$, and showed
that the distribution of primes is intimately related to properties of the function $\zeta(s)$ obtained
by analytic continuation of the series

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}. \tag{4}$$


The function $\zeta(s)$, now called the Riemann zeta function, is initially defined by the series
in (4) in its half-plane of convergence $\sigma > 1$. Riemann showed that this function can be
continued analytically beyond the line $\sigma = 1$. It exists as a meromorphic function in the
entire $s$-plane, its only singularity being a simple pole at $s = 1$ with residue 1. Moreover,
$\zeta(s)$ and $\zeta(1 - s)$ are related by the functional equation

$$\zeta(1 - s) = 2 (2\pi)^{-s} \Gamma(s) \cos\left(\frac{\pi s}{2}\right) \zeta(s).$$


We’ll see in a moment that the zeros of the zeta function play an important role in the
distribution of primes. Where are the zeros? First we note that $\zeta(s)$ is represented by
a convergent infinite product for $\sigma > 1$; convergent products are never zero, so the zeta
function has no zeros for $\sigma > 1$. But it does have zeros in the negative half-plane. This can
be seen from the functional equation. When $s$ is a negative integer, say $s = -n$, $\zeta(1 - s)$
becomes $\zeta(1 + n)$ which is finite. The functional equation gives us

$$\zeta(1 + n) = 2 (2\pi)^{n} \Gamma(-n) \cos\left(\frac{\pi n}{2}\right) \zeta(-n).$$

But the gamma function has a simple pole at each negative integer so this pole must be
canceled by a zero of one of the other factors. The factor $\cos\left(\frac{\pi n}{2}\right)$ is zero at the negative
odd integers; but it doesn’t vanish when $n$ is even, so when $n$ is even the pole of the gamma
function must be canceled by a zero of the zeta function. In other words,

$$\zeta(-2n) = 0 \quad \text{for } n \ge 1.$$


These are called the trivial zeros of $\zeta(s)$. There are infinitely many trivial zeros at the points
$-2, -4, -6$, etc. The remaining zeros of the zeta function must lie in the strip $0 \le \sigma \le 1$,
which is called the critical strip. Riemann showed that there were infinitely many zeros
in the critical strip. The nontrivial zeros are denoted by $\rho$. They occur in conjugate pairs
and the functional equation shows that they are symmetrically located about the midline
$\sigma = \frac{1}{2}$, which is called the critical line.

To connect the zeta function with the distribution of primes, Riemann took the logarithmic 
derivative of the Euler product. Taking logarithms in the equation 


$$\zeta(s) = \prod_p \frac{1}{1 - p^{-s}},$$

we find

$$\log \zeta(s) = -\sum_p \log(1 - p^{-s}).$$


Following Riemann, we differentiate this equation, but the details presented here are not 
the same as those given by Riemann. Differentiate and change sign to get 


$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_p \frac{p^{-s} \log p}{1 - p^{-s}} = \sum_{m=1}^{\infty} \sum_p \frac{\log p}{p^{ms}}.$$


But $\psi(x)$ is a step function with jump $\log p$ at the prime powers, so the last sum can be
written as a Riemann-Stieltjes integral and we get

$$-\frac{\zeta'(s)}{\zeta(s)} = \int_1^{\infty} \frac{d\psi(x)}{x^s} = s \int_1^{\infty} \frac{\psi(x)}{x^{s+1}} \, dx$$

after integration by parts. This is a Mellin transform of the function $\psi(x)$, and it can be
inverted to give

$$\psi(x) = -\frac{1}{2\pi i} \int_{2 - i\infty}^{2 + i\infty} \frac{x^s}{s} \frac{\zeta'(s)}{\zeta(s)} \, ds.$$

The path of integration is along the vertical line $\sigma = 2$. Riemann’s idea was to move the
line of integration as far to the left as possible, all the way to $-\infty$. As we cross the poles,
we have to add the residues of the integrand.

The integrand has a pole at $s = 1$ from the factor $\zeta'(s)/\zeta(s)$, which has residue $-1$, so the pole
at $s = 1$ has residue $x$.

There is a pole at $s = 0$ from the factor $1/s$ with residue $-\zeta'(0)/\zeta(0)$.

And there is a pole at each zero $\rho$ of $\zeta(s)$ with residue $-x^{\rho}/\rho$.

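We separate the nontrivial zeros from the trivial zeros. This gives us

$$\psi(x) = \underbrace{x}_{(s=1)} \; \underbrace{- \frac{\zeta'(0)}{\zeta(0)}}_{(s=0)} \; \underbrace{- \sum_{\rho} \frac{x^{\rho}}{\rho}}_{(s=\rho)} \; \underbrace{+ \sum_{n=1}^{\infty} \frac{x^{-2n}}{2n}}_{(s=-2n)},$$

where the sum over $\rho$ is the contribution from the nontrivial zeros, and the sum on $n$ is the
contribution from the trivial zeros. The sum over $n$ is just the power series for $-\frac{1}{2} \log(1 - x^{-2})$,
and this tends to 0 as $x$ tends to $\infty$, so we have the explicit formula

$$\psi(x) = x - \frac{\zeta'(0)}{\zeta(0)} - \sum_{\rho} \frac{x^{\rho}}{\rho} + o(1) \quad \text{as } x \to \infty.$$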


This remarkable formula tells us that $\psi(x)$ is equal to $x$, plus a constant, plus an error term
depending on the nontrivial zeros of the Riemann zeta function. We divide by $x$, and rewrite
this as follows:

$$\frac{\psi(x)}{x} = 1 - \sum_{\rho} \frac{x^{\rho - 1}}{\rho} + o(1) \quad \text{as } x \to \infty.$$


Incidentally, Riemann expressed his explicit formula somewhat differently. There are
many technical details that have to be taken care of to justify this process, and Riemann was
not able to provide full justification. The version given here was derived by von Mangoldt
in 1894, with full justification, and it has superseded Riemann’s formula in more modern
treatments. This explicit formula tells us that the prime number theorem is equivalent to
the statement

$$\lim_{x \to \infty} \sum_{\rho} \frac{x^{\rho - 1}}{\rho} = 0.$$

If the limit could be taken term by term we would obtain the prime number theorem if we
could show that each term $x^{\rho - 1}$ tends to 0, and this would follow if $\operatorname{Re} \rho < 1$, or, in other
words, if the zeta function has no zeros on the line $\sigma = 1$. Riemann was not able to show
this. He said it was likely that all the nontrivial zeros of the zeta function had real part $\frac{1}{2}$, but
admitted that he could not prove this. This statement has become one of the most famous
unsolved problems in mathematics. It is called the Riemann hypothesis, and it states that
each nontrivial zero of the zeta function has the form $\rho = \frac{1}{2} + it$.

The Riemann hypothesis implies (and is implied by) the relation

$$\psi(x) = x + O(x^{1/2 + \varepsilon}) \quad \text{(for every } \varepsilon > 0\text{)}.$$

Riemann died in 1866 at the age of 39 and, as far as we know, he made no further progress
toward a proof of the prime number theorem. But Riemann’s work stimulated a great deal 
of activity on this problem. 

In 1885, Stieltjes announced that he had proved the Riemann hypothesis and, therefore, 
the prime number theorem, but Stieltjes never published a proof. Stieltjes died in 1894 
without having either substantiated or retracted his claim of having proved the Riemann 
hypothesis. 

In an effort to justify some of Riemann’s formal techniques, Hadamard developed the 
theory of entire functions of finite order, and in 1893 he provided proofs of some of the 
results that Riemann had stated with incomplete proofs. Following Hadamard’s work on 
entire functions, von Mangoldt in 1894 gave the first correct proof of the explicit formula 
for $\psi(x)$ referred to above.

Then in 1896, a little more than a century ago, the prime number theorem was finally 
proved — independently and almost simultaneously — by two young mathematicians, Jacques 
Hadamard, age 31, and Charles-Jean de la Vallée Poussin, age 30. We turn now to some 
remarks about their contributions. 


Contributions by Hadamard and de la Vallée Poussin 


Following Riemann, both Hadamard and de la Vallée Poussin realized that the prime number 
theorem follows from the fact that the zeta function has no zeros on the line $\sigma = 1$, and
both proved this fact for the first time, but they used different methods.

There is a diplomatic remark in Hadamard’s paper that has some historical interest. He
says: “Stieltjes proved, in accordance with Riemann’s expectations, that all the zeros of
$\zeta(s)$ in the critical strip have the form $\frac{1}{2} + it$, where $t$ is real; but his proof has never been
published, and it has not even been established that the function $\zeta$ has no zeros on the line
$\sigma = 1$. It is this last assertion that I propose to prove here.” And so he did, thus providing
a proof of the prime number theorem.

De la Vallée Poussin’s first contributions to prime numbers were published in 1896 in
three memoirs: the first contains his proof of the prime number theorem, the second extends
his method to obtain a prime number theorem for arithmetic progressions, and the third is 
on primes representable by quadratic forms. 

Of the two proofs that $\zeta(1 + it) \ne 0$, Hadamard’s is the simpler, and it is based on a
relation between $\zeta(\sigma + it)$ and $\zeta(\sigma + 2it)$. In a two-page note at the end of his third memoir,
de la Vallée Poussin acknowledges that Hadamard’s method was the simpler of the two, and
then shows how Hadamard’s method can be simplified even further. He adapts Hadamard’s
idea to the function $\zeta'(s)/\zeta(s)$ and shows in just a few lines that the nonvanishing of $\zeta(s)$
on the line $\sigma = 1$ follows quite easily from the elementary trigonometric identity for the
cosine of a double angle:

$$1 - \cos 2\theta = 2(1 - \cos^2 \theta),$$

together with the inequality $0 \le 1 - \cos\theta \le 2$.
The two trigonometric relations imply

$$3 + 4\cos\theta + \cos 2\theta \ge 0,$$
a trigonometric inequality that de la Vallée Poussin used to establish the nonvanishing of
$\zeta(1 + it)$. He then points out that the use of these trigonometric relations shortens his original
proof in the first memoir by 24 pages, and that the same relations can be used to simplify 
the second and third memoirs as well. 
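
The step from the two trigonometric relations to the inequality is short. One way to fill it in (a worked sketch, not necessarily the route taken in the memoir) is to factor the double-angle identity and use the bound on $1 - \cos\theta$:

$$1 - \cos 2\theta = 2(1 - \cos\theta)(1 + \cos\theta) \le 4(1 + \cos\theta),$$

since $0 \le 1 - \cos\theta \le 2$ and $1 + \cos\theta \ge 0$. Rearranging gives

$$3 + 4\cos\theta + \cos 2\theta = 4(1 + \cos\theta) - (1 - \cos 2\theta) \ge 0.$$

Equivalently, substituting $\cos 2\theta = 2\cos^2\theta - 1$ shows that $3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^2$, which is visibly nonnegative.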

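Here’s the connection with the zeta function. From the formula for the logarithm of the
zeta function,

$$\log \zeta(s) = \sum_p \sum_{m=1}^{\infty} \frac{1}{m p^{ms}},$$

obtained by expanding each term $-\log(1 - p^{-s})$ of the earlier formula as a power series,
we see that $\zeta(s)$ is the exponential of this double sum, and the absolute
value of $\zeta(s)$ is the exponential of its real part. Now $p^{-ms} = \exp\{-ms \log p\} =
p^{-m\sigma} \exp\{-imt \log p\}$, so we get

$$|\zeta(s)| = \exp\left\{ \sum_p \sum_{m=1}^{\infty} \frac{\cos(mt \log p)}{m p^{m\sigma}} \right\}.$$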


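De la Vallée Poussin then established the inequality

$$\zeta^3(\sigma) \, |\zeta(\sigma + it)|^4 \, |\zeta(\sigma + 2it)| \ge 1 \tag{5}$$

as follows. The product on the left is equal to

$$\exp\left\{ \sum_p \sum_{m=1}^{\infty} \frac{3 + 4\cos\theta + \cos 2\theta}{m p^{m\sigma}} \right\}, \quad \text{where } \theta = mt \log p.$$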


Because of the trigonometric inequality, each term of this series is $\ge 0$ so the exponential
is $\ge 1$, hence we obtain inequality (5). From this inequality, he deduced that $\zeta(1 + it) \ne 0$
for all real $t$. The argument, by contradiction, is very simple. Suppose that $\zeta(1 + it) = 0$
for some real $t \ne 0$. Inequality (5) can be rewritten as follows by dividing both sides by
$\sigma - 1$, where $\sigma > 1$:

$$\left( (\sigma - 1) \zeta(\sigma) \right)^3 \left| \frac{\zeta(\sigma + it) - \zeta(1 + it)}{\sigma - 1} \right|^4 |\zeta(\sigma + 2it)| \ge \frac{1}{\sigma - 1}.$$

Now let $\sigma$ approach 1 from the right. On the left, the product $(\sigma - 1)\zeta(\sigma)$ tends to 1
because $\zeta(s)$ has a pole at 1 with residue 1, the second factor tends to the fourth power
of the absolute value of the derivative $\zeta'(1 + it)$, which is finite, the third factor tends to the
absolute value of $\zeta(1 + 2it)$, which is finite, so the left member tends to a finite value. But
the right member tends to infinity, and we get a contradiction! Therefore, $\zeta(1 + it) \ne 0$ for
all real $t$, and, as noted above, this implies the prime number theorem.

The original memoir of de la Vallée Poussin develops more profound knowledge of the
properties of $\zeta(s)$. For example he showed that $\zeta(\sigma + it) \ne 0$ in a region of the form
$\sigma > 1 - A/\log t$ for sufficiently large $t$. He later used these properties to extend the prime
number theorem with the more general asymptotic formula

$$\pi(x) \sim \frac{x}{\log x} + \frac{1! \, x}{(\log x)^2} + \cdots + \frac{(n-1)! \, x}{(\log x)^n} \quad \text{as } x \to \infty.$$

The case $n = 1$ is the prime number theorem. He also expressed the prime number theorem
in the form

$$\pi(x) = \mathrm{Li}(x) + O\left( x e^{-a \sqrt{\log x}} \right),$$


where a is a positive constant and Li(x) is the logarithmic integral introduced by Gauss. 
Apart from the value of a, the exponential factor in the error term remained unimproved 
until 1921, when Littlewood showed that the function log x under the square root sign could 
be replaced by log x log log x. This has only historical interest now because the error was 
later improved by I. M. Vinogradov and N. M. Korobov, who showed that $\sqrt{\log x}$ could be
replaced by $(\log x)^{3/5}$.

There is also a prime number theorem for arithmetic progressions, first proved by de la
Vallée Poussin. It states that the number of primes $\le x$ in the progression $kn + h$ (with $h$ and $k$
having no common prime factor) is asymptotic to $\frac{1}{\varphi(k)} \frac{x}{\log x}$. There are $\varphi(k)$ reduced
residue classes modulo $k$, and de la Vallée Poussin’s theorem shows that in the limit the primes
are equally distributed among these $\varphi(k)$ reduced residue classes. Each residue class gets its
fair share of primes.
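
The equal sharing among residue classes shows up clearly in small computations. The sketch below is illustrative only; the modulus $k = 4$, the cutoff 100,000, and the function name `count_by_residue` are assumptions made for the example.

```python
from math import gcd, isqrt


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def count_by_residue(limit, k):
    """Count primes <= limit in each reduced residue class modulo k (k >= 2)."""
    counts = {h: 0 for h in range(1, k) if gcd(h, k) == 1}
    for p in primes_up_to(limit):
        r = p % k
        if r in counts:  # primes dividing k fall outside the reduced classes
            counts[r] += 1
    return counts


limit = 100_000
counts = count_by_residue(limit, 4)
total_primes = len(primes_up_to(limit))

for h, c in sorted(counts.items()):
    print("primes <= %d congruent to %d mod 4: %d" % (limit, h, c))
```

With $k = 4$ there are $\varphi(4) = 2$ reduced classes, and each receives close to half of the primes up to the cutoff (the prime 2 belongs to neither).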


Elementary Proof of the Prime Number Theorem 


The first proof of the prime number theorem, as given by Hadamard and de la Vallée 
Poussin, was simplified by Landau, and by Hardy & Littlewood, in the early part of the 
20th century, and new proofs were later discovered, by Norbert Wiener and others, all using 
sophisticated methods of real and complex analysis, and the fact that $\zeta(s)$ doesn’t vanish
on the line $\sigma = 1$. Then in 1949, Paul Erdős and Atle Selberg discovered an elementary
proof that makes no use of the Riemann zeta function or complex function theory. But this 
so-called elementary proof is more difficult to understand than the analytic proofs. 

In 1980 Donald J. Newman gave a concise version of the analytic proof that uses very 
little complex analysis beyond Cauchy’s theorem. Newman’s proof also uses the fact that 
$\zeta(s)$ doesn’t vanish on the line $\sigma = 1$, but he has a clever way of deducing this.


Another Form of the Prime Number Theorem 


Earlier we stated that the prime number theorem can also be expressed by describing the
asymptotic behaviour of the $n$th prime $p_n$. We can deduce this behaviour from that of $\pi(x)$.
Start with the prime number theorem in the form

$$\lim_{x \to \infty} \frac{\pi(x) \log x}{x} = 1.$$

Take logarithms of both sides to obtain

$$\lim_{x \to \infty} \{ \log \pi(x) + \log\log x - \log x \} = 0,$$

or

$$\lim_{x \to \infty} \left[ \log x \left( \frac{\log \pi(x)}{\log x} + \frac{\log\log x}{\log x} - 1 \right) \right] = 0.$$

Since $\log x \to \infty$ the last factor multiplying $\log x$ must tend to 0, so

$$\lim_{x \to \infty} \left( \frac{\log \pi(x)}{\log x} + \frac{\log\log x}{\log x} - 1 \right) = 0.$$

The quotient $\frac{\log\log x}{\log x}$ tends to 0, hence

$$\lim_{x \to \infty} \frac{\log \pi(x)}{\log x} = 1.$$

Multiply this relation by the prime number theorem in the limit form of (1), cancel $\log x$, and we get

$$\lim_{x \to \infty} \frac{\pi(x) \log \pi(x)}{x} = 1.$$


Now let $x = p_n$, so that $\pi(x) = n$. Then the last formula becomes

$$\lim_{n \to \infty} \frac{n \log n}{p_n} = 1.$$

This says that the $n$th prime $p_n$ is asymptotically equal to $n \log n$ as $n \to \infty$, or, in
other words:


For large n the nth prime grows like n log n. 


It can be shown that this also implies the prime number theorem, so this statement is logically
equivalent to the prime number theorem. 
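
The slow convergence of $p_n/(n \log n)$ to 1 can be observed numerically. The following is a small sketch rather than a definitive implementation: the function name `nth_prime` is an assumption, and the sieve limit relies on the known bound $p_n < n(\log n + \log\log n)$ for $n \ge 6$, which is not taken from the text.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def nth_prime(n):
    """Return the nth prime (n >= 1), sieving up to an upper bound for p_n."""
    if n < 6:
        limit = 13  # the first six primes are 2, 3, 5, 7, 11, 13
    else:
        # Assumed bound (not from the text): p_n < n (log n + log log n) for n >= 6.
        limit = int(n * (math.log(n) + math.log(math.log(n)))) + 1
    return primes_up_to(limit)[n - 1]


ratios = []
for n in (10**3, 10**4, 10**5):
    p_n = nth_prime(n)
    ratios.append(p_n / (n * math.log(n)))
    print(n, p_n, ratios[-1])
```

The ratio stays above 1 and decreases only slowly over this range, which is consistent with the logarithmic rate of convergence seen in the earlier tables.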


Concluding Remarks 


The prime number theorem is important not only because it makes an elegant and simple 
statement about primes and has many applications but also because much new mathematics 
was created in the attempts to find a proof. This is typical in number theory. Some problems, 
very simple to state, are often extremely difficult to solve, and mathematicians working on 
these problems often create new areas of mathematics of independent interest. 

Another example is the Fermat conjecture, which has received more publicity as an 
unsolved problem than any other result in mathematics. It is interesting to note that Gauss 
considered the Fermat conjecture to be of only minor importance, and refused to work on it. 

The prime number theorem and Fermat’s last theorem are two outstanding examples of 
problems that attract the intellectual curiosity of many individuals but resist efforts at solu- 
tion. Repeated failure by eminent mathematicians to settle these problems by known proce- 
dures stimulates the invention of new methods, new approaches, and new ideas that, in time, 
become part of the mainstream of mathematics, and even change the way mathematicians 
think about their subject. This is certainly true of the prime number theorem. Early attempts 
to prove the prime number theorem stimulated the development of the theory of functions 
of a complex variable, a branch of mathematics that is the life blood of mathematical anal- 
ysis. Efforts to prove Fermat’s last theorem led to the development of algebraic number 
theory, one of the most active areas of modern mathematical research, with ramifications 
far beyond the Fermat equation. One unexpected application of algebraic number theory is
in designing security systems for computers. 

In number theory alone there are hundreds of unsolved problems. New problems arise
more rapidly than the old ones are solved, and many of the old ones have remained unsolved
for centuries. Progress of our knowledge of numbers is advanced not only by what we
already know about them, but also by realizing that there is much we do not know about them.
Non-homogeneous Problems:
Conjectures of Minkowski and Watson 


R.P. Bambah, V.C. Dumir and R.J. Hans-Gill 


§1 Introduction 


Here we shall survey the developments regarding two well known problems in Geometry 
of Numbers. The first is a conjecture of Minkowski about the product of non-homogeneous 
real linear forms. The second one is a conjecture of Watson concerning non-homogeneous 
real indefinite quadratic forms. Whereas the first one is still resisting solution in the general 
case, the second one has been completely proved. 

In order to be able to express the conjectures in a geometrical language also, we give 
some definitions. 

Here $\mathbb{R}^n$ denotes the n-dimensional Euclidean space. For any n linearly independent
points $A_1, \ldots, A_n$ in $\mathbb{R}^n$, the set
$\Lambda = \{u_1A_1 + \cdots + u_nA_n : u_1, \ldots, u_n \text{ are integers}\}$ is
called a lattice and $A_1, \ldots, A_n$ is called a basis of $\Lambda$. The determinant of $\Lambda$ is defined
as $d(\Lambda) = |\det(A_1, \ldots, A_n)|$. It equals the volume of the parallelotope generated by the
vectors $A_1, \ldots, A_n$ (called a cell of $\Lambda$). The lattice can be generated by infinitely many
bases; but $d(\Lambda)$ is independent of their choice.
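
The independence of the determinant from the choice of basis can be illustrated with exact arithmetic. This is a small sketch under stated assumptions: the basis, the unimodular matrix `U` and the helper names `det` and `change_basis` are illustrative and do not come from the text. Passing to another basis multiplies the basis matrix by an integer matrix of determinant ±1, which leaves the absolute value of the determinant unchanged.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result            # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def change_basis(basis, unimodular):
    """New basis vectors B_i = sum_j U[i][j] * A_j (rows are basis vectors)."""
    n = len(basis)
    return [[sum(unimodular[i][j] * basis[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

basis = [[2, 1, 0], [0, 3, 1], [1, 0, 1]]     # hypothetical basis A_1, A_2, A_3
U = [[1, 2, 0], [0, 1, 5], [0, 0, 1]]         # integer matrix with determinant 1
new_basis = change_basis(basis, U)
print(abs(det(basis)), abs(det(new_basis)))   # both equal d(Lambda)
```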

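For any subset $S$ of $\mathbb{R}^n$, we say that $\Lambda$ is a covering lattice for $S$ or that $(S, \Lambda)$ is a
covering if

\[ \mathbb{R}^n \subseteq \bigcup \{S + A : A \in \Lambda\}, \]
where
\[ S + A = \{X + A : X \in S\} = \text{translate of } S \text{ through } A. \]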


If $f : \mathbb{R}^n \to \mathbb{R}$ and $S = \{X : |f(X)| \le 1\}$ then $\Lambda$ is a covering lattice for $S$ if and only if
for any $P$ in $\mathbb{R}^n$ there is a point $A$ in $\Lambda$ such that

\[ |f(A + P)| \le 1. \]

This is equivalent to saying that for any $P$ in $\mathbb{R}^n$ and any basis $A_1, \ldots, A_n$ of $\Lambda$, there
exist integers $u_1, \ldots, u_n$ such that

\[ |f(u_1A_1 + \cdots + u_nA_n + P)| \le 1. \]


§2 Minkowski’s Conjecture 


Let $L_i = a_{i1}x_1 + \cdots + a_{in}x_n$, $1 \le i \le n$, be n real linear forms in n variables $x_1, \ldots, x_n$
and having determinant $\Delta = \det(a_{ij}) \ne 0$. The following conjecture is attributed to
H. Minkowski:

For any given real numbers $c_1, \ldots, c_n$, there exist integers $x_1, \ldots, x_n$ such that

\[ |(L_1 + c_1) \cdots (L_n + c_n)| \le |\Delta| / 2^n. \tag{*} \]

Equality is necessary iff after a suitable unimodular transformation the linear forms
$L_i$ have the form $2c_ix_i$ for $1 \le i \le n$.

It is obvious that equality occurs in (*) for the cases mentioned in the conjecture. So this 
conjecture would be proved if we can show that for all other linear forms, (*) holds with 
strict inequality. 
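
The equality case can be checked by brute force for n = 2. The sketch below is illustrative only: the function names, the sample forms and the search window `radius` are assumptions, and a finite search gives only an upper bound for the true infimum in general. For the forms $L_i = 2c_ix_i$ the product equals $c_1c_2(2x_1+1)(2x_2+1)$, so its smallest absolute value is exactly $|c_1c_2| = |\Delta|/4$, while a sample form outside the critical class stays strictly below the bound.

```python
from fractions import Fraction
from itertools import product

def min_abs_product(a, c, radius=6):
    """Minimum of |(L_1 + c_1)...(L_n + c_n)| over integer points in a box.

    a is the coefficient matrix of the forms L_i; the box |x_j| <= radius is an
    arbitrary search window, so in general this is an upper bound for the infimum.
    """
    n = len(a)
    best = None
    for x in product(range(-radius, radius + 1), repeat=n):
        value = Fraction(1)
        for i in range(n):
            value *= sum(Fraction(a[i][j]) * x[j] for j in range(n)) + Fraction(c[i])
        best = abs(value) if best is None else min(best, abs(value))
    return best

def bound(a):
    """|Delta| / 2^n for a 2 x 2 coefficient matrix."""
    delta = Fraction(a[0][0]) * a[1][1] - Fraction(a[0][1]) * a[1][0]
    return abs(delta) / 4

# Critical case L_i = 2 c_i x_i (hypothetical values of c_1, c_2).
c = (Fraction(3, 2), Fraction(5))
critical = [[2 * c[0], 0], [0, 2 * c[1]]]
print(min_abs_product(critical, c), bound(critical))

# A sample pair of forms outside the critical class: L_1 = x + y, L_2 = x - 2y.
generic = [[1, 1], [1, -2]]
shift = (Fraction(1, 2), Fraction(1, 2))
print(min_abs_product(generic, shift), bound(generic))
```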

In the geometrical language of coverings, Minkowski’s conjecture can be stated as 
follows: 

Any lattice $\Lambda$ in $\mathbb{R}^n$ is a covering lattice for the set

\[ S : |x_1 \cdots x_n| \le 2^{-n} d(\Lambda). \]

Further, $\Lambda$ is a covering lattice for the interior of $S$ iff $\Lambda$ does not have a basis consisting
of the vectors of the type $B_i = (0, \ldots, 0, b_i, 0, \ldots, 0)$, $1 \le i \le n$.
It is clear that it is enough to consider lattices of determinant 1.


§2.1 Known Results: Minkowski’s Conjecture 


This result is trivial for n = 1. For n = 2, a proof was first given in 1899 by Minkowski
(see [107] pages 42-45 or [106] §4 Chap. XVI). Several mathematicians: Mordell
(1928, 1941), Landau (1931), Perron (1938), Pall (1943), Macbeath (1948,
1961), Sawyer (1948) (see also Mordell (1953)) and Heilbronn (see Hardy and Wright
[83] §24.7 and page 413) have obtained a variety of proofs, partly in an effort to find a proof
which would generalize to higher dimensions.

The proof by Sawyer follows from an interesting result of Delone (1947), though he
used a slightly weaker result. For any point $A \in \mathbb{R}^n$, the set $\Gamma = A + \Lambda$ is called a grid
and for any cell $P$ of $\Lambda$ and point $Y$ in $\Gamma$, $P + Y$ is called a cell of $\Gamma$. A cell of $\Gamma$ is called
a divided cell if it has a vertex in each of the quadrants. Delone proved that every grid in
$\mathbb{R}^2$ which has no point on the co-ordinate axes has a divided cell. Delone showed that the
analogous result does not hold in $\mathbb{R}^3$.

Minkowski's conjecture has so far been proved for $n \le 5$. For $n \ge 3$, the proofs have
been obtained by mathematicians listed below:

    n = 3    Remak (1923)
             Davenport (1939)
             Birch and Swinnerton-Dyer (1956)
             Narzullaev (1968)

    n = 4    Dyson (1948)
             Skubenko (1973)
             Bambah and Woods (1974)

    n = 5    Skubenko (1973)
             Bambah and Woods (1980)

We briefly describe the three methods of approach that have been successful for n = 3.
Only the first has been successfully extended to give proofs for n = 4, 5.

    I. Remak-Davenport method
    II. Birch and Swinnerton-Dyer method
    III. DOTU method


I. Remak-Davenport method suggests that one way to approach the conjecture of
Minkowski is to prove the following two results:

(i) For any lattice $\Lambda$ in $\mathbb{R}^n$ there is an ellipsoid

\[ E : a_1x_1^2 + \cdots + a_nx_n^2 < 1 \]

which contains no point of $\Lambda$ other than $O$ but has n linearly independent points of $\Lambda$ on its
boundary.

(ii) If $\Lambda$ is a lattice of determinant 1 and there is a sphere $|X| < R$ which contains no
point of $\Lambda$ other than $O$ and has n linearly independent points of $\Lambda$ on its boundary then $\Lambda$
is a covering lattice for $|X| \le \sqrt{n/4}$.

It can be easily seen that Minkowski's conjecture (except for the determination of critical
cases) follows for any n for which both (i) and (ii) hold. Consider any lattice $\Lambda$ of
determinant 1. On making the linear transformation

\[ \sqrt{a_i}\, x_i = (a_1 \cdots a_n)^{1/2n}\, y_i, \quad 1 \le i \le n, \]

which is an automorphism of the form $x_1 \cdots x_n$, and replacing $\Lambda$ by the transformed lattice
we can suppose that the ellipsoid $E$ found in (i) is a sphere. Then (ii) implies that $\Lambda$ is a
covering lattice for the sphere $K : |X| \le \sqrt{n/4}$. Using the inequality of arithmetic and
geometric means we can easily see that $K$ is contained in the region $|x_1 \cdots x_n| \le 1/2^n$.
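
The last step can be written out as follows. This is a short worked derivation added for the reader, using only the arithmetic-geometric mean inequality applied to the squares $x_1^2, \ldots, x_n^2$ of the co-ordinates of a point of the sphere.

```latex
% For X = (x_1, ..., x_n) with |X|^2 = x_1^2 + ... + x_n^2 <= n/4:
\[
  \bigl(x_1^2 \cdots x_n^2\bigr)^{1/n}
  \;\le\; \frac{x_1^2 + \cdots + x_n^2}{n}
  \;\le\; \frac{1}{4},
\]
% and raising both sides to the power n/2 gives
\[
  |x_1 \cdots x_n| \;\le\; \Bigl(\frac{1}{4}\Bigr)^{n/2} \;=\; \frac{1}{2^n}.
\]
```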

For n = 3, Remak's proof was lengthy and complicated. It was considerably simplified
by Davenport (1939). Mahler (1940) gave another proof of (ii).

For n = 4, (ii) was proved by Hofreiter (1933). Dyson (1948) proved both (i)
and (ii) and hence Minkowski's conjecture for n = 4. The proof is considered difficult;
the proof of (i) used strong tools from Algebraic Topology. Cleaver (1965) gave another
proof of (ii). A simpler proof was given by Woods (1965). Using some results of
Birch and Swinnerton-Dyer (1956) (mentioned below) which allow one to restrict
attention to lattices having no points on co-ordinate planes, Bambah and Woods (1974)
gave a proof of (i), hence completing an elementary proof of Minkowski's conjecture for
n = 4. Skubenko (1973) had sketched another elementary proof of (i) for n = 4. He
also gave a proof of (i) for n = 5, in which many details were somewhat obscure. The result
(ii) had already been proved for n = 5 by Woods (1965). Bambah and Woods (1980)
gave a clear detailed proof of (i) for n = 5 on the lines of Skubenko's proof. Again, they
used Birch and Swinnerton-Dyer's result to restrict attention to lattices having no points in
co-ordinate planes.

For n = 6, the result (ii) has been proved by Woods (1972), while the counterpart (i) is
still awaiting proof.

II. Birch and Swinnerton-Dyer (1956) gave a different proof of Minkowski's conjecture
for n = 3. It is based on a homogeneous reduction of linear forms. Though they could not
extend the method to obtain a proof of the conjecture for n = 4, they succeeded in proving
very useful general results stated below:

(A) Suppose that Minkowski's conjecture has been proved for $n = 1, 2, \ldots, N - 1$.
Then to prove it for $n = N$, it is enough to consider only those sets of forms $L_1, \ldots, L_N$
for which the homogeneous minimum

\[ m_N = \inf\{|L_1 \cdots L_N| : x_1, \ldots, x_N \text{ integers not all zero}\} \]

is attained and non-zero.

(B) Minkowski's conjecture is true for all lattices in a neighbourhood of the conjectured
critical lattice (such a neighbourhood is specified explicitly).

The result (A) has been used in the proofs of Minkowski's conjecture for n = 4, 5 by
Bambah and Woods, and Skubenko to restrict their attention to lattices having no points in
the co-ordinate planes.

Macbeath (1961) formed a connection of Minkowski's conjecture with factorization
of matrices in a particular manner. He called a non-singular matrix $M$ of order n Minkowski
if the lattice $M\mathbb{Z}^n$ is a covering lattice for the region

\[ |x_1 \cdots x_n| \le 2^{-n} |\det M|. \]

Thus Minkowski's conjecture can be stated as:
Every non-singular matrix of order n is a Minkowski matrix.

Minkowski had himself proved that every rational non-singular matrix is a Minkowski
matrix. The result (B) of Birch and Swinnerton-Dyer stated above can be restated as:

For every n, there is a neighbourhood of the identity matrix which consists entirely of
Minkowski matrices.

Macbeath showed that more generally, every rational non-singular
matrix has a neighbourhood consisting of Minkowski matrices. For this he used a very
simple method. He called a non-singular matrix $M$ a DOTU-matrix if it can be expressed
as a product

\[ M = DOTU, \]

where $D$ is diagonal, $O$ is orthogonal, $U$ is unimodular and $T$ is unit triangular, i.e.,
$T = (t_{ij})$, $t_{ii} = 1$ for $1 \le i \le n$ and $t_{ij} = 0$ for $i < j$.

Macbeath proved that every DOTU-matrix is a Minkowski matrix. Further, he showed
that if $M = DOTU$ and $O = (a_{ij})$ with $\det(a_{ij}) \ne 0$ then $M$ is an inner point of the
family of Minkowski matrices. Since every rational non-singular matrix can be written as
a product $DTU$, the result applies to these. Macbeath remarked that, for n = 2, it can be
seen with little difficulty that every non-singular matrix is a DOTU-matrix. This is essentially the
basis of the proof due to Heilbronn given in Hardy and Wright [83]. Narzullaev
(1968, 1969, 1975) established the corresponding result for n = 3.

Gruber (1970, 1976) and Ahmedov (1972, 1975, 1977) showed the existence
of non-singular matrices which are not DOTU. The proof depends upon the existence of
totally real algebraic number fields of degree n whose discriminant is $< n^n$. The existence
of such fields is guaranteed by some results of class field theory. Skubenko (1981) gave
an example for n = 2880.

For results on the measure of the set of lattices for which Minkowski’s conjecture holds 
and for a discussion of related results see Section (xvi) of Gruber and Lekkerkerker [74]. 
§2.2 Chebotarev's Theorem and Asymptotic Estimates


A lot of work has been done on obtaining weaker results which are valid for all n. Let $\mu_n$
be the infimum of the numbers $\mu$ such that any lattice $\Lambda$ is a covering lattice for

\[ |x_1 \cdots x_n| \le \mu\, d(\Lambda). \]

Then Minkowski's conjecture states that $\mu_n = 2^{-n}$.

Chebotarev (1934) obtained the interesting result $\mu_n \le 2^{-n/2}$ for all n. The proof is
simple and elegant. It essentially depends on the fact that the union $K$ of two translates
$S \pm (1, \ldots, 1)$ of the hyperboloid $S : |x_1 \cdots x_n| \le 1$ contains the box
$\max_{1 \le i \le n} |x_i| \le \sqrt{2}$ to
which Minkowski's convex body theorem is applied. Several authors have worked on
obtaining better estimates of $\mu_n$ by using a convex body of bigger volume contained in $K$
and applying Minkowski's theorem, or applying Blichfeldt's theorem and its refinements.
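
The containment behind Chebotarev's argument is easy to test numerically. In the sketch below the box is taken to be $\max_i |x_i| \le \sqrt{2}$; this reading, the function names and the sample size are assumptions of the example. The reason it works is that $\prod |x_i - 1| \cdot \prod |x_i + 1| = \prod |x_i^2 - 1| \le 1$ on this box, so at least one of the two products is at most 1.

```python
import math
import random

def in_translate(x, sign):
    """True if x lies in S + sign*(1,...,1), where S is |x_1 ... x_n| <= 1."""
    # x is in S + v exactly when x - v is in S; a tiny tolerance absorbs rounding.
    return abs(math.prod(xi - sign for xi in x)) <= 1 + 1e-12

def in_union(x):
    """True if x lies in the union K of the two translates S + (1,..,1) and S - (1,..,1)."""
    return in_translate(x, 1) or in_translate(x, -1)

random.seed(0)
n = 5                                   # dimension chosen for illustration
root2 = math.sqrt(2)
samples = [[random.uniform(-root2, root2) for _ in range(n)] for _ in range(20_000)]
print(all(in_union(x) for x in samples))

# A box of half-side 2 would be too large: the corner (2, -2) is in neither translate.
print(in_union([2, -2]))
```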

Writing $\mu_n \le 2^{-n/2} \nu_n^{-1}$ and $\nu = \lim_{n \to \infty} \nu_n$, we can summarise these improvements in
terms of values of $\nu$ as follows:

    Value of $\nu$                              Author (year)
    $2e - 1 \approx 4.43656$                    Davenport (1946)
    $2(2e - 1) \approx 8.873127$                Woods (1958)
    $3.0001(2e - 1) \approx 13.310134$          Bombieri (1963)
    $3(2e - 1) \approx 13.309690$               Gruber (1967)

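
The decimal values quoted for the multiples of $2e - 1$ can be recomputed directly. This is only an arithmetic check; the variable names are illustrative.

```python
import math

base = 2 * math.e - 1
# Multipliers of (2e - 1) as they appear in the list of estimates.
for factor in (1, 2, 3.0001, 3):
    print(factor, round(factor * base, 6))
```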


Skubenko (1977, 1978) obtained $\nu_n = e^{-2} n^{1/3} \log^{-2/3} n$ for large n, which is
considerably better than the earlier estimates, since here the limit $\nu$ equals $\infty$. For
improvements in this direction see Narzullaev and Skubenko (1979), Mukhsinov (1981),
Andrijasjan, Ilin and Malyshev (1986). Mukhsinov (1981) and Ilin (1991) have
obtained improved estimates for small values of n.


§2.3 Asymmetric Inequalities 


It is interesting though not surprising that some best possible general results which can
be regarded as asymmetric versions of Minkowski's conjecture have been obtained with
simple, elegant methods. Chalk (1947) showed that every lattice $\Lambda$ is a covering lattice
for the region

\[ x_1 \cdots x_n \le d(\Lambda), \quad x_i > 0 \text{ for } 1 \le i \le n. \]

Considering the integral lattice, we see that this result is best possible.
Cole (1952) showed that every lattice $\Lambda$ is a covering lattice for the region

\[ x_1 \cdots x_{n-1} |x_n| \le \tfrac{1}{2} d(\Lambda), \quad x_i > 0 \text{ for } 1 \le i \le n - 1. \]

These results suggest a natural CONJECTURE:
Every lattice $\Lambda$ is a covering lattice for the region

\[ x_1 \cdots x_{n-r} |x_{n-r+1} \cdots x_n| \le 2^{-r} d(\Lambda), \quad x_i > 0 \text{ for } 1 \le i \le n - r. \]

(For r = n, this is Minkowski's conjecture.)

For n = 2, there are only two cases, which are taken care of by Chalk's theorem and
Cole's theorem. For n = 3, the missing third case was confirmed by Bambah and Woods
(1977). The above conjecture is open for $n \ge 4$ and $2 \le r \le n$ except for r = n = 4, 5,
which are the confirmed cases of Minkowski's conjecture.

Davenport (1948), in an effort to establish an analogue of Minkowski's conjecture for
(ternary) quadratic forms (which topic is the subject of our exposition in the second part)
proved another type of asymmetric inequality. He showed that any plane lattice $\Lambda$ is a
covering lattice for

\[ -\sigma d(\Lambda) \le x_1x_2 \le \rho d(\Lambda), \quad \rho > 0, \ \sigma > 0, \ \rho\sigma \ge 1/16. \]

Sawyer (1950) gave a simple geometrical proof which is a modification of his proof
mentioned earlier for $\rho = \sigma = 1/4$. Woods (1981) proved the corresponding results for
the regions

\[ -\sigma d(\Lambda) \le x_1|x_2| \le \rho d(\Lambda), \quad \rho\sigma \ge 1/16 \]

and

\[ -\sigma d(\Lambda) \le x_1x_2|x_3| \le \rho d(\Lambda), \quad \rho\sigma \ge 1/64. \]

The corresponding best result for the region

\[ -\sigma d(\Lambda) \le x_1x_2x_3 \le \rho d(\Lambda) \]

has still not been proved. Grover (1989) proved it for $\rho\sigma \ge 1/16.81$, instead of the
expected $\rho\sigma \ge 1/64$. He could prove it with $\rho\sigma \ge 1/64$ for lattices with homogeneous
minimum zero.


§2.4 Other Directions 


Several investigations in other directions have been made for the product of homogeneous
linear forms. For example, bounds on the non-homogeneous minimum related to
the homogeneous minima of product of linear forms have been obtained by Birch and
Swinnerton-Dyer (1956), Gruber (1970), Bakiev, Pen and Skubenko (1978), Bakiev
(1981), Skubenko and Bakiev (1979). Forms associated with algebraic number fields
have been discussed in some detail by Clarke (1951) and Davenport (1952).

For introductory reading one can refer to §§24.7-10 of Hardy and Wright [83] and 
§11.4 of Cassels [28]. For a detailed description of many results and history, Gruber and 
Lekkerkerker [74] is excellent. 


§3 Non-homogeneous Indefinite Quadratic Forms 


Minkowski's theorem for n = 2 can be formulated in terms of non-homogeneous minima
of indefinite binary quadratic forms. Let $L_1(x, y) = \alpha x + \beta y$, $L_2(x, y) = \gamma x + \delta y$ be two
real linear forms of determinant $\Delta = \alpha\delta - \beta\gamma \ne 0$. Let

\[ Q(x, y) = L_1(x, y) L_2(x, y) = ax^2 + bxy + cy^2. \tag{3.1} \]

Then $Q(x, y)$ is an indefinite quadratic form of determinant $D = ac - b^2/4 = -\Delta^2/4$.
Given real numbers $c_1, c_2$ we can find reals $x_0, y_0$ such that

\[ L_1(x_0, y_0) = c_1, \quad L_2(x_0, y_0) = c_2 \tag{3.2} \]

so that

\[ (L_1 + c_1)(L_2 + c_2) = Q(x + x_0, y + y_0). \tag{3.3} \]
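
The relation between the determinants of $Q$ and of the pair $L_1, L_2$ can be verified by expanding the product. The following worked computation is added for the reader; it uses only the definitions $L_1 = \alpha x + \beta y$, $L_2 = \gamma x + \delta y$ and $\Delta = \alpha\delta - \beta\gamma$, and shows why the bound $|\Delta|/2^2$ of Minkowski's conjecture for n = 2 becomes $(|D|/4)^{1/2}$ in terms of $Q$.

```latex
% Coefficients of Q = L_1 L_2 = a x^2 + b x y + c y^2:
\[
  a = \alpha\gamma, \qquad b = \alpha\delta + \beta\gamma, \qquad c = \beta\delta .
\]
% Determinant of Q:
\[
  D = ac - \frac{b^2}{4}
    = \alpha\beta\gamma\delta - \frac{(\alpha\delta + \beta\gamma)^2}{4}
    = -\frac{(\alpha\delta - \beta\gamma)^2}{4}
    = -\frac{\Delta^2}{4} .
\]
% Hence |D| = \Delta^2/4 and
\[
  \Bigl(\frac{|D|}{4}\Bigr)^{1/2} = \frac{|\Delta|}{4} = \frac{|\Delta|}{2^2} .
\]
```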


Conversely, given an indefinite binary quadratic form $Q(x, y)$ and real numbers $x_0, y_0$
we can find real linear forms $L_1, L_2$ and real numbers $c_1, c_2$ such that (3.1), (3.2), (3.3)
hold. An equivalent alternative way of stating Minkowski's result is the following:


Theorem 1 Let $Q(x, y)$ be an indefinite binary quadratic form of determinant $D \ne 0$.
Then given any real numbers $x_0, y_0$ there exist integers $x, y$ such that

\[ |Q(x + x_0, y + y_0)| \le \left(\tfrac{1}{4}|D|\right)^{1/2}. \tag{3.4} \]

Equality is needed in (3.4) iff $Q(x, y) \sim \rho xy$, $\rho \ne 0$ and $(x_0, y_0) \sim (\tfrac{1}{2}, \tfrac{1}{2}) \pmod 1$.


Remark 1 We say that two quadratic forms $f(X) = f(x_1, \ldots, x_n)$ and $g(X) =
g(x_1, \ldots, x_n)$ are equivalent, and we denote this by $f \sim g$, if there exists a unimodular
transformation $T$ such that $f(TX) = g(X)$. Further we say $X = (x_1, \ldots, x_n) \sim
(y_1, \ldots, y_n) = Y \pmod 1$ if $Y - TX$ is a point with integer co-ordinates.

A natural question is to generalize the above result to indefinite quadratic forms in more than two variables.
Blaney (1948) proved the following result.


Theorem 2 Let $Q = Q(x_1, \ldots, x_n)$ be an indefinite quadratic form in n variables with
determinant $D \ne 0$. For any given $\gamma$ a number $\Gamma = \Gamma(\gamma, n)$ exists such that for any $Q$ and
any real numbers $\alpha_1, \ldots, \alpha_n$, integers $x_1, \ldots, x_n$ exist such that

\[ \gamma |D|^{1/n} < Q(x_1 + \alpha_1, \ldots, x_n + \alpha_n) < \Gamma |D|^{1/n}. \tag{3.5} \]

The proof is by induction on n. In particular it implies that there exists a constant
$C = C(n)$ such that for real $\alpha_1, \ldots, \alpha_n$ there exist integers $x_1, \ldots, x_n$ such that

\[ |Q(x_1 + \alpha_1, \ldots, x_n + \alpha_n)| < C |D|^{1/n}. \tag{3.6} \]

His proof shows that we can take $C = 2^{n-2}$.

The inequality (3.6) of Blaney was strengthened by D.M.E. Foster (1956) by showing
that we can take $\Gamma(\gamma, n) = \gamma + C_n \gamma^{1/2} + C'_n$, where $C_n, C'_n$ are positive numbers
depending only on n. C.A. Rogers (1952), using the minima of positive definite quadratic
forms, showed that the inequality (3.6) is true with $C = \tfrac{1}{4} n \gamma_n$, where $\gamma_n$ is Hermite's
constant. Since $\gamma_n \le \delta_n \sim \dfrac{n}{\pi e}$, we have $C \le \beta_n \sim \dfrac{n^2}{4\pi e}$.
Davenport (1948) extended Minkowski's theorem to indefinite ternary forms
by proving

Theorem 3 Let $Q(x, y, z)$ be an indefinite quadratic form in three variables with
determinant $D \ne 0$. Then given any real numbers $x_0, y_0, z_0$ there exist integers $x, y, z$ such
that

\[ |Q(x + x_0, y + y_0, z + z_0)| \le \left(\tfrac{27}{100}|D|\right)^{1/3}. \tag{3.7} \]

Equality is needed in (3.7) iff

\[ Q(x, y, z) \sim \rho(x^2 + 5y^2 - z^2 + 5yz + zx), \quad \rho \ne 0 \tag{3.8} \]

and

\[ (x_0, y_0, z_0) \sim (\tfrac{1}{2}, \tfrac{1}{2}, 0) \pmod 1. \]
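
The critical case of Theorem 3 can be probed with exact rational arithmetic. The sketch below rests on assumptions that should be read with care: it takes the form in (3.8) with $\rho = 1$, whose determinant is $|D| = 25/2$, and the shift $(\tfrac{1}{2}, \tfrac{1}{2}, 0)$; the function names and the search window are illustrative. For this shift, $4Q = u^2 + 5v^2 - 10z^2$ with $u = 2x + 1 + z$, $v = 2y + 1 + z$, which is congruent to 2 modulo 4 and cannot equal $\pm 2$ (a square is never $\pm 2$ modulo 5), so the minimum of $|Q|$ is exactly $3/2$.

```python
from fractions import Fraction
from itertools import product

def q(x, y, z):
    """The ternary form x^2 + 5y^2 - z^2 + 5yz + zx (rho = 1 is assumed)."""
    return x * x + 5 * y * y - z * z + 5 * y * z + z * x

def min_abs_value(shift, radius=8):
    """Smallest |q| over the shifted integer points inside the search window."""
    x0, y0, z0 = shift
    return min(
        abs(q(x + x0, y + y0, z + z0))
        for x, y, z in product(range(-radius, radius + 1), repeat=3)
    )

half = Fraction(1, 2)
abs_det = Fraction(25, 2)                      # |D| of the form above
bound_cubed = Fraction(27, 100) * abs_det      # cube of the bound in Theorem 3

print(min_abs_value((half, half, 0)), bound_cubed)
# A different shift, for comparison, gives a much smaller minimum.
print(min_abs_value((half, half, half)))
```

The cube of the first printed value equals the second, which is the equality case; for the shift $(\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2})$ the minimum found is well below the bound.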


For the proof of (3.7) Davenport obtained the following generalization of Minkowski's
theorem.

Theorem 4 Let $L_1, L_2$ be linear forms as above. If $\rho > 0$, $\sigma > 0$ are such that $\rho\sigma \ge \tfrac{1}{16}$,
then given any reals $c_1, c_2$ there exist integers $x, y$ such that

\[ -\sigma |\Delta| \le (L_1 + c_1)(L_2 + c_2) \le \rho |\Delta|. \tag{3.9} \]

We shall discuss this result in greater detail in Section 5.

If $Q(x_1, \ldots, x_n) = \sum_{i,j} a_{ij}x_ix_j$, $a_{ij} = a_{ji}$, is an indefinite quadratic form of
determinant $D = \det(a_{ij}) \ne 0$, then $Q$ can be written as

\[ Q(x_1, \ldots, x_n) = L_1^2 + \cdots + L_r^2 - L_{r+1}^2 - \cdots - L_{r+s}^2, \]

where $1 \le r, s < n$, $r + s = n$, and $L_1, \ldots, L_{r+s} = L_n$ are linear forms in $x_1, \ldots, x_n$ of
determinant $\Delta \ne 0$, where $|D| = |\Delta|^2$. The numbers $r, s$ are uniquely determined by $Q$.
We say $Q$ is a quadratic form of type $(r, s)$ and the number $\sigma = r - s$ is called the signature
of $Q$.


Let $C_{r,s}$ denote the infimum of all those constants $C$ such that for any indefinite quadratic
form $Q(x_1, \ldots, x_n)$ of type $(r, s)$, determinant $D \ne 0$ and real numbers $c_1, \ldots, c_n$, there
exist integers $x_1, \ldots, x_n$ such that

\[ |Q(x_1 + c_1, \ldots, x_n + c_n)| \le (C|D|)^{1/n}. \tag{3.10} \]

One of the basic problems in Geometry of Numbers is to determine $C_{r,s}$ and to determine
all those quadratic forms $Q$ of type $(r, s)$ for which equality is needed in (3.10) with $C = C_{r,s}$
for some $c_1, \ldots, c_n$. Such forms we shall call critical forms. Results of Minkowski and
Davenport cited above imply that $C_{1,1} = \tfrac{1}{4}$, $C_{2,1} = C_{1,2} = \tfrac{27}{100}$.

Birch (1958) generalized Minkowski's result to all forms of signature 0, by proving
that $C_{r,r} = \tfrac{1}{4}$ for all $r \ge 1$. More explicitly, he proved
Theorem 5 Let $Q(x_1, \ldots, x_{2r})$ be an indefinite quadratic form of signature 0 and
determinant $D \ne 0$. Then for any real numbers $c_1, \ldots, c_{2r}$ there exist integers $x_1, \ldots, x_{2r}$
such that

\[ |Q(x_1 + c_1, \ldots, x_{2r} + c_{2r})| \le \left(\tfrac{1}{4}|D|\right)^{1/2r}. \tag{3.11} \]

Equality is necessary in (3.11) iff

\[ Q \sim \rho(x_1x_2 + \cdots + x_{2r-3}x_{2r-2} + 2x_{2r-1}x_{2r}), \quad \rho \ne 0 \]

and

\[ (c_1, \ldots, c_{2r}) \sim (0, \ldots, 0, \tfrac{1}{2}, \tfrac{1}{2}) \pmod 1. \]


His proof is by induction on r and by dealing with the zero forms and non-zero forms
separately. A quadratic form $Q(x_1, \ldots, x_n)$ is said to be a zero form if there exist integers
$x_1, \ldots, x_n$ not all zero such that $Q(x_1, \ldots, x_n) = 0$, otherwise it is called a non-zero form.
If $Q(x_1, \ldots, x_n) = Q_n$ is a quadratic form in n variables we define the inhomogeneous
minimum $M_I(Q_n)$ by

\[ M_I(Q_n) = \sup_{c_1, \ldots, c_n} \ \inf_{x_i \equiv c_i \ (\mathrm{mod}\ 1)} |Q(x_1, \ldots, x_n)|. \tag{3.12} \]

The results proved by Birch are


Theorem 6 Let $Q_n$ be an indefinite non-singular quadratic form in n variables with
incommensurable coefficients. Suppose that either (i) $n \ge 3$ and $Q_n$ represents arbitrarily small
non-zero values, or (ii) $n \ge 4$ and $Q_n$ represents zero properly.

Then $M_I(Q_n) = 0$.


Theorem 7 Let $Q_{2r}$ be a quadratic form in 2r variables of signature zero and determinant
$D \ne 0$, which does not represent zero properly. Then

\[ M_I(Q_{2r}) \le \left(\tfrac{1}{4}|D|\right)^{1/2r} \min\left\{1, \left(\tfrac{5}{6}\right)^{(r-4)/3}\right\}. \]

Theorem 8 Let $Q_{2r}$ be a rational quadratic form of signature zero and determinant $D \ne 0$
that represents zero properly. Then

\[ M_I(Q_{2r}) \le \left(\tfrac{1}{4}|D|\right)^{1/2r}. \]

Theorem 6 is derived from Blaney's theorem stated above and the following results of
Oppenheim proved in 1953.

Theorem 9 If $Q(x_1, \ldots, x_n)$ is an indefinite quadratic form such that for every $\varepsilon > 0$, the
inequality

\[ 0 < Q(x_1, \ldots, x_n) < \varepsilon \]

is solvable in integers $x_1, \ldots, x_n$, then for $n \ge 3$, the inequality

\[ 0 < -Q(x_1, \ldots, x_n) < \varepsilon \]

is also solvable in integers $x_1, \ldots, x_n$ for every $\varepsilon > 0$.

Theorem 10 If the indefinite, non-singular quadratic form $Q(x_1, \ldots, x_n)$ is an
incommensurable form which represents zero properly, then the inequality

\[ 0 < |Q(x_1, \ldots, x_n)| < \varepsilon \]

is solvable in integers.


For the proof of Theorem 7, using results from ternary and quaternary forms, Birch
developed a reduction in which $Q_{2r}$ represents a binary quadratic form of small determinant.
More precisely he proved

Theorem 11 Let $Q_{2r}$ be a quadratic form of signature zero in $2r \ge 4$ variables of
determinant $D \ne 0$ such that $Q_{2r}$ does not take small values. Then

\[ Q_{2r} \sim \alpha(x_1 + a_2x_2 + \cdots)^2 - \beta(x_2 + b_3x_3 + \cdots)^2 + Q_{2r-2}(x_3, \ldots, x_{2r}) \]
\[ \phantom{Q_{2r}} = \psi(x_1 + \cdots,\ x_2 + b_3x_3 + \cdots) + Q_{2r-2}(x_3, \ldots, x_{2r}), \]

where $\psi(x, y) = \alpha(x + a_2y)^2 - \beta y^2$ is an indefinite binary quadratic form of determinant
$-\delta$, with

\[ \delta^r \le \lambda |D|, \]

where $\lambda$ is a suitable constant.


To derive the result by induction he proved the following asymmetric inequality for 
non-homogeneous indefinite binary quadratic forms. 


Lemma 1 Let $Q(x, y)$ be an indefinite binary quadratic form of determinant $-\delta$. Then
given any real numbers $x_0, y_0$ and $\mu$, we can find $(x, y) \equiv (x_0, y_0) \pmod 1$ such that


LO(x, y) + pl < max(27/2g'/? | 5/4 1/2), (3.13) 
If $Q \sim P_2 = x^2 + xy - y^2$ we can solve the stronger inequality
|P2(x, y) + p| < max(A, |w?/9). (3.14) 


For the zero rational quadratic forms Birch proved the following reduction result:

Theorem 12 Let $Q_{2r}$ be a rational quadratic form of determinant $D \ne 0$ of signature zero
that represents zero properly. Then


Oo, ~ Ay(x, + @)2x2 +--+ +)x2 + H2(x3 4+ A24x4 +--+) x4 
+++ + Ain (X2m—1 + Am.2mXIm + °° -)X2m + Q2n-2m, 


where |ajj| < 5, H; > 0, a is an integer and either m = n in which case Q2n-2m IS 
I e . e e 
omitted orm =n — 1 orn —2 in which case Q2n—-2m is a binary or a quaternary non-zero 


form of signature zero. 


From these results one easily deduces the following 


Corollary 1 If $Q_{2r}$ is a rational quadratic form of determinant $D \ne 0$, signature zero and
which represents zero, then we have


Ow~ Wixi tax2+...,x2 + Bx3 +---) + Qar-2(%3,..-, X2r), 


where either w = H(x; +a2x2+---)x2 with (4H)*" < ;|D| or w is binary form with det 
d with \d|" < ;|D\. 


Theorems 7 and 8 follow easily by induction on r, using Theorem 11, Corollary 1
and Lemma 1.

Watson (1960) extended Theorem 6 of Birch by proving


Theorem 13 Let $f(x_1, \ldots, x_n)$ be an indefinite non-singular quadratic form of
determinant $D \ne 0$ in $n \ge 3$ variables which takes arbitrarily small non-zero values for integers
$x_1, \ldots, x_n$. Then for any real $a, c_1, \ldots, c_n$ and $\varepsilon > 0$, the inequality

\[ |f(x_1 + c_1, \ldots, x_n + c_n) - a| < \varepsilon \]

has a solution in integers $x_1, \ldots, x_n$.

Theorem 14 Let $f(x_1, \ldots, x_n)$ be an indefinite non-singular quadratic form with $n \ge 3$
which is a multiple of a rational form and represents zero non-trivially. Let $c_1, \ldots, c_n$ be
real numbers which are not all rational. Then for any real $a$ and $\varepsilon > 0$, there exist integers
$x_1, \ldots, x_n$ such that

\[ |f(x_1 + c_1, \ldots, x_n + c_n) - a| < \varepsilon. \]


Theorem 13 is proved in the same manner as Theorem 6. Theorem 14, after a suitable
reduction, is derived from the following result on uniform distribution mod 1.

Lemma 2 If $P = P(x_1, \ldots, x_n)$ is a polynomial whose coefficients other than the constant
term are not all rational, then the fractional part of $P$ for integers $x_i$ takes values which are
everywhere dense in the interval (0, 1).
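
The density statement of Lemma 2 can be observed numerically. The sketch below is only an illustration: the polynomial $P(x) = \sqrt{2}\,x^2 + x/3$ in one variable, the function names and the sample sizes are assumptions, and floating-point arithmetic limits how far the experiment can be pushed. It measures the largest gap left in [0, 1) by the fractional parts of $P(1), \ldots, P(N)$; density means that this gap tends to 0 as N grows.

```python
import math

def largest_gap(values):
    """Largest gap between neighbouring points of a finite subset of [0, 1), end gaps included."""
    pts = sorted(values)
    gaps = [pts[0]] + [b - a for a, b in zip(pts, pts[1:])] + [1 - pts[-1]]
    return max(gaps)

def fractional_parts(count):
    """Fractional parts of P(x) = sqrt(2) * x^2 + x/3 for x = 1, ..., count."""
    return [math.fmod(math.sqrt(2) * x * x + x / 3, 1.0) for x in range(1, count + 1)]

for count in (10, 100, 1_000, 10_000):
    print(count, round(largest_gap(fractional_parts(count)), 5))
```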


The next major development in the determination of $C_{r,s}$ was due to Watson in 1962,
who determined $C_{r,s}$ for all $r, s$ such that $n = r + s \ge 21$, i.e. he solved the problem
completely for forms in $n \ge 21$ variables. The precise result proved by him is

Theorem 15 Let $Q(x_1, \ldots, x_n)$ be an indefinite quadratic form in $n \ge 21$ variables of
determinant $D \ne 0$ of type $(r, s)$ and signature $\sigma = r - s$. Then for any real numbers
$c_1, \ldots, c_n$ there exist integers $x_1, \ldots, x_n$ such that

\[ |Q(x_1 + c_1, \ldots, x_n + c_n)| \le (C_{r,s}|D|)^{1/n} \tag{3.15} \]

where

\[ C_{r,s} = \begin{cases}
  1/4 & \text{if } \sigma \equiv 0 \text{ or } \pm 1 \pmod 8 \\
  1/3 & \text{if } \sigma \equiv \pm 2 \pmod 8 \\
  1/2 & \text{if } \sigma \equiv \pm 3 \pmod 8 \\
  1   & \text{if } \sigma \equiv \pm 4 \pmod 8
\end{cases} \tag{3.16} \]

Equality is needed in (3.15) for each $r, s$ for suitable $Q$ and $c_1, \ldots, c_n$.

Watson reduces the problem to consideration of values of suitable integral quadratic
polynomials in n variables. A well known conjecture of Oppenheim states that if $Q(x_1, \ldots, x_n)$
is an incommensurable indefinite quadratic form of determinant $D \ne 0$ in $n \ge 5$ variables
then $Q(x_1, \ldots, x_n)$ takes arbitrarily small values. This conjecture was proved to be true for
forms in $n \ge 21$ variables in a series of papers by Davenport, Birch and Ridout (see [44])
using analytical and other techniques.

Thus the result for incommensurable forms follows from Theorem 6. For commensurable
forms, by homogeneity we can assume $Q$ to be a primitive integral form. Since $n \ge 21$, $Q$
is a zero form. If $c_1, \ldots, c_n$ are not all rational, then again (3.15) is true with any $\varepsilon > 0$ on
the right hand side by Theorem 14. We may therefore assume that the $c_i$ are rational and write

\[ c_i = \frac{u_i}{q}, \quad i = 1, \ldots, n, \]

where $u_i, q$ are integers, $q > 0$ and $(u_1, \ldots, u_n, q) = 1$. We can therefore write

\[ f(x_1, \ldots, x_n) = Q(x_1 + c_1, \ldots, x_n + c_n) = Q(x_1, \ldots, x_n) + K^{-1}(b_1x_1 + \cdots + b_nx_n) + c \]

where $c$ is a rational constant, $b_1, \ldots, b_n, K$ are integers with $K > 0$, $(K, b_1, \ldots, b_n) = 1$.

The values of $f$ for integral $x_i$ are of the form $c + K^{-1}N$. Any such value is assumed by
$f$ if and only if the Diophantine equation

\[ KQ(x_1, \ldots, x_n) + b_1x_1 + \cdots + b_nx_n = N \tag{3.17} \]

is soluble in integers $x_i$.
If one can show that (3.17) is soluble for at least one of every $H$ consecutive integers,
then it will follow that

\[ (C_{r,s}|D|)^{1/n} \le \frac{H}{2K}. \]
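
The passage from "one soluble N in every H consecutive integers" to a bound of the shape $H/(2K)$ can be made explicit. The following short derivation is added for the reader and uses only the fact that the values of $f$ at integer points are the numbers $c + N/K$ with $N$ soluble in (3.17).

```latex
% Let N_1 be the largest soluble integer with c + N_1/K <= 0 and let N_2 be the
% next soluble integer. One of N_1 + 1, ..., N_1 + H is soluble, so N_2 - N_1 <= H and
\[
  c + \frac{N_1}{K} \;\le\; 0 \;<\; c + \frac{N_2}{K},
  \qquad
  \Bigl(c + \frac{N_2}{K}\Bigr) - \Bigl(c + \frac{N_1}{K}\Bigr) \;\le\; \frac{H}{K}.
\]
% The value of smaller absolute value is at most half of this gap:
\[
  \inf_{x \in \mathbb{Z}^n} |f(x)| \;\le\; \frac{H}{2K}.
\]
```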
It is known that (3.17) has a solution iff

\[ KQ(x_1, \ldots, x_n) + b_1x_1 + \cdots + b_nx_n \equiv N \pmod{p^t} \]

has a solution for all prime powers $p^t$. Using p-adics it is easy to see that we need to consider
only finitely many prime powers which are related to the determinant of $Q$. A careful
analysis of this reduces the problem to one concerning distribution of quadratic residues.
Watson in fact by this method also proves a more general gap theorem, namely he obtains
the smallest value of $\beta$ depending only on $\sigma, n, D$ such that for any $\alpha$

\[ \alpha < Q(x_1 + c_1, \ldots, x_n + c_n) < \alpha + \beta \]

always has a solution for any $Q$ of signature $\sigma$. We shall discuss this in greater detail in
Section 6.

It may be remarked that the condition $n \ge 21$ is not very essential for this p-adic argument
but avoids a lot of tedious computation. However, as he himself says, the method will not
work for small n.


Watson's Conjecture:
The formulae for $C_{r,s}$ given in (3.16) hold for $n \ge 4$.

Dumir (1967) proved this conjecture for n = 4. For n = 5, the conjecture was proved
by Hans-Gill and Madhu Raka (1979, 1980). They also obtained all the critical forms.
Their methods for n = 4, 5, like that of Davenport, are based on reduction of the problems
to forms in fewer variables and obtaining and using suitable asymmetric inequalities for
such forms.

It may be remarked that the conjectured values of $C_{r,s}$ for $n \ge 4$ depend only on the class
of $\sigma = r - s \pmod 8$.

The next obvious step was to prove Watson's conjecture for quadratic forms in any number
of variables having a fixed signature $\sigma = 0, \pm 1, \pm 2, \pm 3$ and $\pm 4$. For forms of signature
0, it was already done by Birch. Madhu Raka (1981, 1983 a, b), using the method of
Birch, proved Watson's conjecture for all forms of signature $\sigma = \pm 1, \pm 2, \pm 3, \pm 4$. Thus
together with Birch's result for $\sigma = 0$, Watson's conjecture was known for a complete
residue system mod 8.

After this, in order to prove Watson's conjecture completely, one just needed to prove that
$C_{r,s} = C_{r',s'}$ if $r + s = r' + s' = n$ and $r - s \equiv r' - s' \pmod 8$. This was proved by Dumir,
Hans-Gill and Woods (1994). Margulis in 1987 proved Oppenheim's conjecture. He
in fact proved


Theorem 16 If $f(X)$ is an indefinite incommensurable quadratic form of determinant
$D \ne 0$ in $n \ge 3$ variables, then for any $\varepsilon > 0$ there exists $X$ in $\mathbb{Z}^n$ such that

\[ 0 < |f(X)| < \varepsilon. \]

Thus for incommensurable forms $f(X)$ in $n \ge 3$ variables, Theorem 13 implies that for
any real $a$, $C = (c_1, \ldots, c_n)$ and $\varepsilon > 0$

\[ |f(X + C) - a| < \varepsilon \tag{3.18} \]

has a solution in $X \in \mathbb{Z}^n$.

Further if $f(X)$ is commensurable and $n \ge 5$ and $C$ is in $\mathbb{R}^n - \mathbb{Q}^n$, then again (3.18) is
true by Theorem 14.

Thus to prove that $C_{r,s}$ depends only on signature mod 8, we need to restrict ourselves to
only primitive integral forms $Q(X)$ and $C \in \mathbb{Q}^n$. Using results on congruentially equivalent
integral forms, Dumir, Hans-Gill and Woods in 1994 proved the following general theorem
on non-homogeneous indefinite quadratic forms with integral coefficients and having the
same signature modulo 8.
Theorem 17 Let $\alpha < \beta$ be real numbers and let $n \ge 6$ be an integer. Let $\sigma, \sigma_0$ be integers
such that $|\sigma| < n$, $|\sigma_0| < n$, $\sigma \equiv \sigma_0 \pmod 8$ and $n \equiv \sigma \pmod 2$. Suppose that for each
integral quadratic form $g(X)$ in n variables of determinant $D \ne 0$ and signature $\sigma_0$ and
for any $C \in \mathbb{Q}^n$, the inequality

\[ \alpha |D|^{1/n} < g(X + C) < \beta |D|^{1/n} \tag{3.19} \]

is solvable for $X$ in $\mathbb{Z}^n$.

Then for any integral quadratic form $f(X)$ in n variables of determinant $D$ and signature
$\sigma$ and for any $C \in \mathbb{Q}^n$, there exists $X$ in $\mathbb{Z}^n$ such that

\[ \alpha |D|^{1/n} < f(X + C) < \beta |D|^{1/n}. \tag{3.20} \]


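For convenience, if we denote by $C_{n,\sigma_0}$ the constant in Watson's conjecture for a quadratic
form of signature $\sigma_0$ in n variables then we have

\[ C_{n,\sigma_0} = \begin{cases}
  1/4 & \text{if } \sigma_0 \equiv 0 \text{ or } \pm 1 \pmod 8 \\
  1/3 & \text{if } \sigma_0 \equiv \pm 2 \pmod 8 \\
  1/2 & \text{if } \sigma_0 \equiv \pm 3 \pmod 8 \\
  1   & \text{if } \sigma_0 \equiv \pm 4 \pmod 8
\end{cases} \tag{3.21} \]

If we take $|\sigma_0| \le 4$, $-\alpha = \beta = (C_{n,\sigma_0})^{1/n}$, Watson's conjecture follows from
Theorem 17 and the results of Birch and Madhu Raka cited above.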


§4 Positive Values of Non-homogeneous Indefinite 
Quadratic Forms 


Since indefinite quadratic forms take both positive and negative values one can consider
only one sided inequalities and ask:

Do there exist numbers $\Gamma$ depending only on $r, s$ such that if $Q(x_1, \ldots, x_n)$ is a real
indefinite quadratic form of type $(r, s)$ and determinant $D \ne 0$, then for any given real
$c_1, \ldots, c_n$ there exist integers $x_1, \ldots, x_n$ satisfying

\[ 0 < Q(x_1 + c_1, \ldots, x_n + c_n) \le (\Gamma |D|)^{1/n}? \tag{4.1} \]

As remarked earlier the existence of such $\Gamma$ was proved by Blaney (1948).

Let $\Gamma_{r,s}$ denote the infimum of all such constants $\Gamma$ for which (4.1) has a solution. The
problem is to evaluate $\Gamma_{r,s}$ for different $r, s$ and to determine all those forms $Q$ and $c_1, \ldots, c_n$
for which equality is needed in (4.1) with $\Gamma = \Gamma_{r,s}$.

For indefinite binary quadratic forms this result was proved by Davenport and Heilbronn (1947), who showed that

$$0 < Q(x + x_0, y + y_0) \le (4|D|)^{1/2} \tag{4.2}$$

always has a solution if $Q(x, y)$ is an indefinite binary quadratic form of determinant $D \ne 0$. Clearly equality is needed in (4.2) if

$$Q(x, y) = xy, \quad x_0 = y_0 = 0.$$

Thus $\Gamma_{1,1} = 4$.
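The extremal example can be checked by brute force. The following is only a sketch: the helper name `min_positive_value` and the search box `bound` are illustrative, and the determinant of $xy$ is taken to be that of the symmetric matrix with off-diagonal entries $1/2$, which is an assumption about the normalisation used here.

```python
import math

def min_positive_value(bound: int = 20) -> int:
    """Smallest positive value of Q(x, y) = x*y over integers with |x|, |y| <= bound."""
    values = [x * y
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1)
              if x * y > 0]
    return min(values)

# Determinant of xy, read as the symmetric matrix [[0, 1/2], [1/2, 0]] (assumed convention).
D = 0 * 0 - (1 / 2) ** 2
upper = math.sqrt(4 * abs(D))  # right-hand side of (4.2)

print(min_positive_value(), upper)
```

Under this normalisation the smallest positive value of $xy$ coincides with the right-hand side of (4.2), which is why the constant 4 cannot be lowered for this form.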
Blaney (1950) showed that $\Gamma_{2,1} = 4$ and also determined all the critical forms. This result was rediscovered by Barnes in 1961.

$\Gamma_{1,2} = 8$ was proved by Dumir in 1967, thereby completing the result for ternary forms.

Dumir (1968; a,b) proved that $\Gamma_{3,1} = 16/3$, $\Gamma_{2,2} = 16$. Dumir and Hans-Gill (1981) proved that $\Gamma_{1,3} = 16$, thereby completing the results for quaternary forms.

$\Gamma_{3,2} = 16$, $\Gamma_{4,1} = 8$ were proved by Hans-Gill and Raka in 1980 and 1981 respectively. The methods used in the proofs of the above results are purely arithmetical, relying on results for asymmetric, homogeneous and non-homogeneous forms in 2, 3, 4 variables, some of which were known and some of which were discovered for the purpose of these applications.

P23 = (7/4)° was proved by Bambah, Dumir and Hans-Gill!? in 1984 who in fact 
proved more, which we shall discuss later. Bambah, Dumir and Hans-Gill, in a series of papers in 1981, 1983, 1984, using a reduction similar to that of Birch (1958) cited earlier, proved that for $n \ge 6$, $-1 \le \sigma \le 3$, where $\sigma = r - s$ is the signature of the quadratic form $Q(x_1, \dots, x_n)$ of type $(r, s)$,

$$\Gamma_{r,s} = \frac{2^n}{|\sigma| + 1}. \tag{4.3}$$

For incommensurable forms they use a method similar to that of Birch as in Theorem 11, and the proof is completed by induction using the following analogue of Lemma 1 proved by them.

Lemma 3 Let ¢(x, y) be an indefinite binary quadratic form of determinant —é which 
does not represent zero trivially. Then given any real numbers xo, yo and > O, we can 
find (x, y) = (Xo, yo)(mod1) such that 


0 < d(x, y) +p < max (2V8, des J). (4.4) 
If Q~ Py =x? +xy — y*, we can solve the stronger inequality 
0 < Po(x,y) +m < max [2.1, (/5u7)'71. (4.5) 


For commensurable forms, which for $n \ge 5$ are necessarily zero forms, they again use the reduction of Birch, but the proof in this case is carried out not by Birch's method but by a gap theorem of T.H. Jackson (1971) for zero forms. It relies very heavily on a lemma of Macbeath (1951) on the non-homogeneous minima for a parabolic region. Macbeath's Lemma can be formulated as


Lemma 4 Let $\alpha, \beta, A$ be real numbers with $\alpha \ne 0$. Let $2h, k$ be positive integers such that

$$\bigl|h - k^2|\alpha|\bigr| + \tfrac{1}{2} < A. \tag{4.6}$$

Further suppose that either $|\alpha| \ne h/k^2$ or $\beta \not\equiv h/k \pmod{1/k,\ 2\alpha}$, i.e. $\beta - h/k$ is not an integral combination of $1/k$ and $2\alpha$. Then for any real number $\nu$, there exist integers $x, y$ satisfying

$$0 < x + \beta y + \alpha y^2 + \nu < A. \tag{4.7}$$
If $Q(x_1, \dots, x_n)$ is a zero rational form, by using Birch reduction and homogeneity, we can suppose

$$Q(x_1, \dots, x_n) = (x_1 + a_2x_2 + \cdots)x_2 + \phi(x_3, \dots, x_n).$$

Suppose $\phi$ represents a number $-a$ primitively, with $a < 0$, satisfying a suitable inequality. In this case we can assume

$$\phi(x_3, \dots, x_n) = -ax_3^2 + \cdots$$

so that

$$Q(x_1, \dots, x_n) = (x_1 + a_2x_2 + \cdots)x_2 - ax_3^2 + \cdots$$

Suppose we are interested in solving an inequality of the type

$$0 < Q(x_1 + c_1, \dots, x_n + c_n) < (\Gamma|D|)^{1/n} = d, \text{ say}.$$

We can take $|c_i| \le 1/2$. In case $d > 1/2$ and $c_2 \ne 0$, then for $x_1 = x$, $x_2 = \dots = x_n = 0$, $Q(x + c_1, c_2, \dots, c_n) = xc_2 + \mu$ for some $\mu$. Now choose an integer $x$ such that

$$0 < Q(x + c_1, c_2, \dots, c_n) = xc_2 + \mu \le |c_2| \le \tfrac{1}{2} < d.$$

Again, if $c_2 = 0$ and $d > 1$ we are through as above if we take $x_2 = 1$, $x_3 = \dots = x_n = 0$ and $x_1$ suitably.

In other cases we take $x_2 = c_2$ if $0 < |c_2| \le 1/2$, $x_2 = \pm 1$ if $c_2 = 0$, $x_1 = x$, $x_3 = y$, $x_4 = \dots = x_n = 0$, so that

$$Q(x_1 + c_1, x_2 + c_2, \dots, x_n + c_n) = [(x + c_1) + a_2x_2 + a_3(y + c_3) + a_4x_4 + \cdots]x_2 - a(y + c_3)^2 + \cdots = xx_2 + \beta' y - ay^2 + \nu'$$

for some $\beta', \nu'$. Therefore,

$$0 < Q(x_1 + c_1, \dots, x_n + c_n) < d$$

is soluble if there exist integers $x, y$ such that

$$0 < xx_2 + \beta' y - ay^2 + \nu' < d,$$

that is,

$$0 < \pm x + \frac{\beta'}{|x_2|}\,y - \frac{a}{|x_2|}\,y^2 + \frac{\nu'}{|x_2|} < \frac{d}{|x_2|}.$$

This is an inequality of the type (4.7) and hence is soluble if we can satisfy the conditions of Lemma 4 for suitable $h, k$.
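It was conjectured by Bambah, Dumir and Hans-Gill that for $n \ge 6$,

$$\Gamma_{r,s} = \begin{cases} \dfrac{2^n}{|\sigma| + 1} & \text{if } |\sigma| \le 3 \\ \text{(value illegible in the source)} & \text{if } \sigma = 4 \end{cases} \tag{4.8}$$

and

$$\Gamma_{r,s} = \Gamma_{r',s'} \quad \text{if } r + s = r' + s' = n,\ r - s \equiv r' - s' \pmod 8 \tag{4.9}$$

except perhaps for a finite number of exceptions.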

Mary Flahive (1988) proved the conjecture of Bambah, Dumir and Hans-Gill for $n \ge 21$ by a technique similar to that of Watson (1962). Aggarwal and Gupta (1988, 1991a,b) determined $\Gamma_{r,s}$ for $\sigma = -2$ and $n \ge 8$, $\sigma = -3$ and $n \ge 9$, and $\sigma = 4$ with $n \ge 6$, and confirmed the conjecture of Bambah et al. in these cases.

From the general theorem (Theorem 17) of Dumir, Hans-Gill and Woods proved in 1994, the second part of the conjecture of Bambah et al. follows, by taking $\alpha = 0$, $\beta = \Gamma_{r,s}^{1/n}$, from the known results for forms of signature $-3 \le \sigma \le 4$.

This argument determines $\Gamma_{r,s}$ except for $\Gamma_{2,5}$, $\Gamma_{2,4}$, $\Gamma_{1,4}$.

$\Gamma_{2,5} = 32$ was proved by Dumir and Sehmi (1994) and $\Gamma_{2,4} = 64/3$ was proved by Dumir, Hans-Gill and Sehmi in 1995, confirming the conjecture of Bambah et al.
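The two values just quoted can be compared with the conjectured formula. The sketch below assumes that the first case of (4.8) reads $2^n/(|\sigma| + 1)$ for $|\sigma| \le 3$, with $n = r + s$ and $\sigma = r - s$; the function name `gamma_conjectured` is purely illustrative.

```python
from fractions import Fraction

def gamma_conjectured(r: int, s: int) -> Fraction:
    """2^n / (|sigma| + 1) with n = r + s and sigma = r - s (assumed form of (4.8), |sigma| <= 3)."""
    n, sigma = r + s, r - s
    if abs(sigma) > 3:
        raise ValueError("the formula is only assumed for |sigma| <= 3")
    return Fraction(2 ** n, abs(sigma) + 1)

print(gamma_conjectured(2, 5))  # type (2, 5): n = 7, sigma = -3
print(gamma_conjectured(2, 4))  # type (2, 4): n = 6, sigma = -2
```

Exact rational arithmetic is used so that a value such as $64/3$ is reproduced without rounding.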

This leaves open only the determination of $\Gamma_{1,4}$. It is conjectured that $\Gamma_{1,4} = 8$. Dumir and Sehmi (1994) proved that $8 \le \Gamma_{1,4} \le 16$. Madhu Raka and Urmila Rani (1997) have improved this to $8 \le \Gamma_{1,4} \le 12$. They observe that it may be possible to extend this method to show that $\Gamma_{1,4} \le 32/3$, but the number of cases to be dealt with is very large. There is difficulty in applying Macbeath's Lemma if one attempts to obtain a smaller upper bound.

It may be noted that for large values of $n$, the evaluation of $C_{r,s}$, $\Gamma_{r,s}$ is relatively easy. For small values of $n$ a detailed and careful analysis is needed. However, after the proof by Margulis of Oppenheim's conjecture, for $n \ge 5$ we need to deal with rational forms only, and these are zero forms by Meyer's theorem; so Birch's reduction for zero rational forms and Macbeath's Lemma can be used.


§5 Asymmetric Inequalities for Non-homogeneous 
Indefinite Quadratic Forms 


As described in §3 and §4, to determine the constants $C_{r,s}$ and $\Gamma_{r,s}$ one needs to prove more general asymmetric inequalities for non-homogeneous indefinite quadratic forms. We briefly describe these in this section.

Let $Q(x_1, \dots, x_n)$ be an indefinite quadratic form of determinant $D \ne 0$ of type $(r, s)$. Let $t$ be real. One is interested in finding a function $f(t)$ (which will depend on $r, s$ also) such that for any real $c_1, \dots, c_n$, there exist integers $x_1, \dots, x_n$ such that

$$-t(f(t)|D|)^{1/n} < Q(x_1 + c_1, \dots, x_n + c_n) \le (f(t)|D|)^{1/n}. \tag{5.1}$$

One would be interested in finding the best $f(t)$ for each $t$ which will work for all quadratic forms $Q$ of type $(r, s)$. However, for the purpose of applications it is more appropriate to find continuous functions $f(t)$ which work for $t$ in a given interval but which need not be best possible for all $t$ in that interval.

Determination of such $f(t)$ is related to the non-homogeneous minima of the region

$$-t < x_1^2 + \cdots + x_r^2 - x_{r+1}^2 - \cdots - x_n^2 \le 1.$$


Theorem 4, proved by Davenport in 1948, shows that for indefinite binary quadratic forms $Q(x, y)$ we can solve (5.1) with

$$f(t) = \frac{1}{4t}.$$

For $t = 1$ it reduces to Minkowski's result.

Blaney (1950) improved upon this result. For binary forms there is no loss of generality in assuming that $t \le 1$. Blaney reproves Davenport's theorem and shows that the result
is best possible for tf = 1 andt = aa m = 1,2,3,... and obtained all the forms for 
which equality is needed for these t. For 0 < t < 1/3, Blaney proved that the inequality 
(5.1) is soluble with f(t) = [(t + 1)(1 +. 9t)]~'/ and the inequalities are best possible for 
= wo m= 1,2,...,andt = 0. Fort < 0, he obtains a suitable function f (t) which is 
best possible for only finitely many values of $t$. The first and third results are adaptations of the geometric method of Davenport, while the other proofs are arithmetical in nature. These results of Blaney have been used extensively in the literature for determining different types of non-homogeneous inequalities for indefinite forms in a larger number of variables. Dumir and Grover (1986) improved upon some of the results of Blaney for $t$ lying in different sub-intervals and showed that their results were best possible for infinitely many new values of $t$. Grover and Raka (1991) obtained further improvements for some range of $t$.

Dumir (1969) determined such a function $f(t)$ for indefinite ternary forms of type $(2, 1)$ for $t \ge 0$. $f(t)$ is specified as a cubic polynomial in six different intervals. For example $f(t) = 8/(1 + t)^3$ for $7 < t < \infty$. The result is best possible for eight values of $t$, namely $t = 0, 1/7, 3/5, 1, 9/7, 3/7$ and $\infty$. The earlier best known values were $f(1)$ due to Davenport, $f(0)$ due to Blaney and Barnes, and $f(\infty)$ due to Dumir. Hans-Gill and Raka (1979) were able to sharpen Dumir's result in the interval $1/7 < t < 1/3$ by utilizing known inequalities for the values of homogeneous and inhomogeneous binary quadratic forms. Hans-Gill and Raka (1980) also gave a function $f(t)$ for $0 \le t \le 1$ such that the inequality

$$t(f(t)|D|)^{1/3} < Q(x + x_0, y + y_0, z + z_0) \le (f(t)|D|)^{1/3}$$

is soluble for quadratic forms $Q$ of type $(2, 1)$. The result is best possible only for $t = 0$ and $1/9$.

Results of a similar nature for other quadratic forms of special type have been determined by Dumir, Hans-Gill and Raka and have been applied for determining $C_{r,s}$ and $\Gamma_{r,s}$.
§6 Gaps between Values of Non-homogeneous
Indefinite Quadratic Forms


Watson in his 1962 paper also considered another type of asymmetric inequality for non-homogeneous indefinite quadratic forms. Let $Q(x_1, \dots, x_n)$ be an indefinite quadratic form of determinant $D \ne 0$ and type $(r, s)$. Watson considered inequalities of the type

$$\alpha < Q(x_1 + c_1, \dots, x_n + c_n) \le \alpha + \beta \tag{6.1}$$

and defined $G = G(Q)$ as the least upper bound of the real numbers $\beta$ for which, for suitable real $\alpha = \alpha(Q, \beta)$, the inequality (6.1) is not soluble in integers $x_1, \dots, x_n$ for some $c_1, \dots, c_n$. Clearly the inhomogeneous minimum $M_I$ of $Q$ satisfies

$$2M_I \le G.$$

Watson proved the following results: 


Theorem 18 Let $Q(x_1, \dots, x_n)$ be a real indefinite non-singular quadratic form in $n \ge 21$ variables, of determinant $D$, type $(r, s)$ and signature $\sigma$. Then

$$G \le 2|D|^{1/n}. \tag{6.2}$$

Equality in (6.2) is needed iff either

(i) $Q \sim x_1^2 + \cdots + x_r^2 - x_{r+1}^2 - \cdots - x_n^2$, $(c_1, \dots, c_n) \sim (1/2, \dots, 1/2)$

or

(ti) n even, 8 fo. 


Q~(sgnyo > E(xgi-z,....x81)+ > x2i-1%2, 
0<8i <|o| la|<2i<n 


(c1,.--,€n) ~ (O,...,0) 


where 
\ 4 
ES EX ox Key A >? + (2xj44 +X) —X1 — x2 — X3 = 4). 
i=] 
Watson in fact showed that forn > 21 if f(x1,...,%,) = O(x1 +¢1,..., Xn +p) 1S not 
a multiple of a rational form, then we have 
pls 


T.H. Jackson (1971) improved upon Watson's result for zero forms for all $n$ by proving


Theorem 19 If $Q(x_1, \dots, x_n)$ is a non-singular indefinite zero form of determinant $D$, then

$$G(Q) \le 2|D|^{1/n} \tag{6.4}$$

and the result is best possible for all $n$.

He determined all the critical forms also. The proof is by induction on $n$. Jackson conjectured the result to be true for $n \ge 4$ without the hypothesis that $Q(x_1, \dots, x_n)$ is a zero form. The result is false for $n = 2, 3$.

In 1983, Bambah, Dumir and Hans-Gill proved the following weaker version of the conjecture of Jackson for non-zero indefinite quadratic forms of signature $0$, $\pm 1$ or $\pm 2$.


Theorem 20 Let $Q(x_1, \dots, x_n)$ be a real indefinite quadratic form of determinant $D \ne 0$ and of signature $0$, $\pm 1$ or $\pm 2$. Let $|a| < |D|^{1/n}$. For any real $c_1, \dots, c_n$ there exist integers $x_1, \dots, x_n$ such that

$$|Q(x_1 + c_1, \dots, x_n + c_n) - a| < |D|^{1/n}.$$


It may be observed that, in view of the result of Margulis on Oppenheim's conjecture, Jackson's conjecture is true, by Watson's theorem cited above, for non-zero forms in $n \ge 4$ variables which are not multiples of rational forms. Since by Meyer's theorem rational indefinite quadratic forms in $n \ge 5$ variables represent zero, Jackson's conjecture is true for $n \ge 5$. Thus Jackson's conjecture needs to be verified only for non-zero rational forms in four variables.


§7 Isolated Minima of Non-homogeneous 
Indefinite Quadratic Forms


A natural question is whether the constants $C_{r,s}$, $\Gamma_{r,s}$ are isolated, i.e. if we omit the forms equivalent to the critical forms for $C_{r,s}$ ($\Gamma_{r,s}$), can the inequality (3.10) ((4.1)) be solved for some $C < C_{r,s}$ ($\Gamma < \Gamma_{r,s}$)? It is known that $C_{1,1} = 1/4$, $\Gamma_{1,1} = 4$ are not isolated, i.e. for any $\epsilon > 0$ the inequality

$$|Q(x + x_0, y + y_0)| < \left(\left(\tfrac{1}{4} - \epsilon\right)|D|\right)^{1/2}$$

is not satisfied for some $Q(x + x_0, y + y_0) \not\sim (x + \tfrac{1}{2})(y + \tfrac{1}{2})$.
Similarly, for any $\epsilon > 0$ the inequality

$$0 < Q(x + x_0, y + y_0) < ((4 - \epsilon)|D|)^{1/2}$$

is not soluble for some $Q(x + x_0, y + y_0) \not\sim (x + 0)(y + 0)$.

However, it is known that for $n \ge 3$ the constants $C_{r,s}$, $\Gamma_{r,s}$ are isolated (see Vulakh (1985)). Davenport (1948) himself had shown that the constant $C_{2,1} = C_{1,2} = 27/100$ is isolated. The exact values of the next two successive minima for ternary forms were determined by Barnes (1954, 1956). If we consider the complete spectrum of values


m(Q,Cc\,...,Cn) = inf OO teal for indefinite quadratic forms Q(x), ..., Xn) of 
type (r, s) and determinant D, where the infimum is taken over all integers x), ..., x», then 
M,|Q|= sup m(Q,c},...,Cn) 


Ch avers Cn 


and 
Crs = uP: 
Vulakh (1985) has shown that for $n \ge 3$ the set of numbers $m(Q, c_1, \dots, c_n)$ has zero as its only limit point. We arrange the numbers $m^n(Q, c_1, \dots, c_n)$ in descending order. The first term is $C_{r,s}$.

Let $C^{(k)}_{r,s}$ be the $k$th term of this sequence. Then $C^{(k)}_{r,s}$ is called the $k$th successive minimum for non-homogeneous quadratic forms of type $(r, s)$. The following results are known:


Z 2 
cy) =c} =4 — Barnes!6(1954) 
CG, = Co = i Barnes!/(1956) 

(2) (2) l 
C = C = + 

ae 23 ®t Dumir and Sehmi>’(1995) 
8 =~¢® 21 

Se 2.3 8 
cP) =c?} =1 — Dumir and Sehmi* (1994) 
Cy) = C ie = i Raka and Urmila Rani??(1997) 
C ee ; Cs i) i Raka and Urmila Rani (unpublished). 


Similarly, let $\Gamma^{(k)}_{r,s}$ denote the $k$th successive minimum for positive values of non-homogeneous quadratic forms of type $(r, s)$. In this case the following results are known:


PY} = 16  Bambah, Dumir and Hans-Gill!? (1983) 
r@} = 22 — Dumir and Hans-Gill™(1997) 

ry) = 8 — Dumir and Sehmi°6(1992) 

ro, = 227-1 Dumir and Sehmi>’(1992) 

ry} = 8 — Rakaand Urmila Rani!®! (1996) 


Rieger gave a wrong proof of c — 4in 1976. Raka and Urmila Rani?”:!™ (1994) gave 
a correct proof of Rieger’s result and used it to prove that 


(2) 
Peso, 


— qr. 

Raka (1993) and Raka and Rani (1998) have also obtained some results valid for zero forms of types $(2, 1)$ and $(2, 2)$.

The questions about the general behaviour of these sequences are still open. 
On the Oscillation Theorems of Pringsheim and Landau


Paul T. Bateman and Harold G. Diamond 


1 Introduction 


Our theme is a relation between the sign of a real function and the analytic behaviour of its associated generating function at a special point on the boundary of convergence. The central idea is the plausible principle: Let $f$ denote a real valued function with support contained in $[0, \infty)$ or the nonnegative integers and let $\hat f$ denote the generating function associated with $f$. If $f$ is ultimately of one sign, then extremal behaviour of $|\hat f|$ occurs around the real point with largest real part on the boundary of the region of convergence of the series or integral that defines $\hat f$.

The main results treated are (1) that analyticity of the generating function at this point entails an infinitude of sign changes of $f$ and (2) if $f$ is everywhere nonnegative, then we can give upper estimates for $|\hat f|$ in terms of its size near the distinguished point. Each of these results has interesting applications in analysis and number theory.

To fix ideas, we state the first theorem on the subject. Let $f(n)$ denote a real valued function defined on the nonnegative integers and let $\hat f(z) = \sum_{n=0}^{\infty} f(n)z^n$ denote the associated generating function, a power series converging for $|z| < R$ for some positive $R$. It is well known that $\hat f$ has at least one singular point on its circle of convergence.


Theorem 1 (Pringsheim). If $f$ and $\hat f$ are as above and $\hat f$ can be continued as an analytic function at the positive real point $z = R$, then the radius of convergence exceeds $R$ or the original function $f$ has an infinite number of changes of sign. Equivalently, if $f$ is of one sign from some point onward, then the radius of convergence exceeds $R$ or $\hat f$ must have a singularity at the point $z = R$.


The first statement and proof of the theorem were given in 1894 by Pringsheim [11], 
but the name of Vivanti is sometimes appended to the theorem. A year earlier the latter 
had noted informally and without proof a form of the theorem with the extra condition 
that the coefficients of the power series are rapidly decreasing [15]. In his 1901 book on 
analytic function theory [16], Vivanti himself ascribed Theorem 1 to Pringsheim. This 
result was given also in Hadamard’s 1901 book on analytic functions (p. 21 of [6]), but with 
no attribution. 


Proof: (This is the argument given in [6], [11], and [16].) We may assume that $f \ge 0$, since changing finitely many coefficients alters $\hat f$ only by a polynomial and the overall sign is immaterial. If we consider the Taylor expansion of the function $\hat f$ around the point $z = r$, where $r$ is some fixed positive number less than $R$, then clearly the radius of convergence $\rho$ of that Taylor series is at least $R - r$. But if the point $z = R$ is not a singular point, then $\rho > R - r$. Now the Taylor series for $\hat f$ around the point $z = re^{i\theta}$ for any real $\theta$ is majorized by the Taylor series for $\hat f$ around the point $z = r$, since the $j$-th coefficients of the two Taylor series are

$$\sum_{n=j}^{\infty} f(n)\binom{n}{j}(re^{i\theta})^{n-j}, \qquad \sum_{n=j}^{\infty} f(n)\binom{n}{j}r^{n-j}$$

respectively. Hence the radius of convergence of the Taylor series for $\hat f$ around the point $z = re^{i\theta}$ is at least $\rho$ for every real $\theta$. Thus, if $z = R$ is not a singularity of $\hat f$, then the original Taylor series $\sum f(n)z^n$ must have radius of convergence at least $\rho + r > R$. $\square$


Here are a few simple examples, each with $f$ defined on $\mathbb{N} \cup \{0\}$ and radius of convergence $R = 1$:

$$f_1(n) = (-1)^n, \qquad \hat f_1(z) = 1/(1 + z),$$
$$f_2(n) = 1, \qquad \hat f_2(z) = 1/(1 - z),$$
aa is n C 5 ee | 
f3(n) = 1, n odd, =—1/2 » n even, I= aa 

The first two examples illustrate the two forms of the theorem, while the third example shows that the theorem does not have a valid converse. For simplicity, the functions given here have simple poles; in fact other types of singularities can occur. For example, the generating function of $f_4(n) = 1$, $n$ a square, $f_4(n) = 0$ otherwise, is

$$\sum_{m=0}^{\infty} z^{m^2},$$

a function which is singular at every point of the unit circle.

It was noted by Landau [9] that an analogue of Pringsheim's Theorem holds for generalized Dirichlet series. This form of the result has turned out to have more applications than the power series version; this is reasonable in view of the non-compact boundary of the region of convergence. Here, we shall treat Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ with real coefficients or, more generally, Mellin transforms $\hat F(s) = \int_{1-}^{\infty} x^{-s}\,dF(x)$, where $F$ is a right continuous real function supported on $[1, \infty)$ which is of bounded variation on each finite interval $[0, X]$. A function $F$ having these properties will be said to be of the class $\mathcal{V}$. A few remarks on various transforms are appropriate here. The choice $F(x) = \sum_{n \le x} a_n$ enables us to express a Dirichlet series as a Mellin transform, and we will use the same notation in both cases. The change of variable $u = \log x$ and setting $G(u) = F(x)$ leads to the unilateral Laplace-Stieltjes transform $\int_{0-}^{\infty} e^{-su}\,dG(u)$.

For any of these transforms the open region of convergence, if such exists, is a half plane $\{s \in \mathbb{C} : \Re s > \alpha\}$ for some real number $\alpha$, called the abscissa of convergence. The transform may converge at all, some, or no points of the line $\{s : \Re s = \alpha\}$. In contrast to the behaviour of power series on the (compact!) circle of convergence, the analytic function associated with a Dirichlet series or Mellin transform may have no singularities on the line of convergence. For example, we have

$$\sum_{n=1}^{\infty} (-1)^{n-1} n^{-s} = (1 - 2^{1-s})\zeta(s), \qquad \Re s > 1,$$
n=} 
where $\zeta(s)$ denotes the Riemann zeta function. It is known that $\zeta(s)$ is analytic on $\mathbb{C} - \{1\}$, and has a simple pole at $s = 1$. Thus $(1 - 2^{1-s})\zeta(s)$ is entire, while the series $\sum_{n=1}^{\infty} (-1)^{n-1} n^{-s}$ is divergent at $s = 0$.
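A quick numerical illustration of this example, as a sketch only: the helper name `eta_partial` and the choice $s = 2$ (where $\zeta(2) = \pi^2/6$ is available in closed form) are illustrative assumptions, not part of the argument.

```python
import math

def eta_partial(s: float, terms: int) -> float:
    """Partial sum of sum_{n>=1} (-1)^(n-1) * n^(-s)."""
    return sum((-1) ** (n - 1) / n ** s for n in range(1, terms + 1))

s = 2.0
zeta_2 = math.pi ** 2 / 6                    # closed form of zeta(2)
closed_form = (1 - 2 ** (1 - s)) * zeta_2    # (1 - 2^(1-s)) * zeta(s) at s = 2
print(eta_partial(s, 100_000), closed_form)

# At s = 0 every term is +1 or -1, so the partial sums alternate and cannot converge.
print([int(eta_partial(0.0, N)) for N in range(1, 7)])
```

The first line compares a long partial sum with the closed form; the second shows the oscillation of the partial sums at $s = 0$.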

The analogue of Pringsheim's theorem for Dirichlet series and Mellin transforms is:


Theorem 2 (Landau's Oscillation Theorem). If $\sum_{n=1}^{\infty} a_n n^{-s}$ is a Dirichlet series (respectively $\int_{1-}^{\infty} x^{-s}\,dF(x)$ a Mellin transform) that converges for $\Re s > \alpha$ and if the associated analytic function is regular at the point $s = \alpha$, then either the series (transform) converges to the left of the point $s = \alpha$ or else the sequence $a_n$ is not ultimately of one sign ($F$ is not ultimately monotonic). Equivalently, if $a_n$ or $dF$ is of one sign from some point onward, then the abscissa of convergence is less than $\alpha$ or $\hat F$ must have a singularity at the point $s = \alpha$.


2 Proof of Landau’s Theorem 


We prove the result for Mellin transforms, assuming that $\alpha$ is the abscissa of convergence of $\hat F(s) = \int_{1-}^{\infty} x^{-s}\,dF(x)$, that $F$ is monotonic from some point onward, and that $\hat F$ is analytic at $\alpha$. We shall show that the integral defining $\hat F$ converges to the left of $\alpha$, contradicting the assumption upon $\alpha$.

Suppose WLOG that $F \uparrow$ on $(x_0, \infty)$. We expand $\hat F$ as a Taylor series about some real point $\beta > \alpha$:

$$\hat F(s) = \sum_{j=0}^{\infty} \hat F^{(j)}(\beta)(s - \beta)^j / j!.$$

For $j = 0, 1, 2, \dots$ and any $X > x_0$, we have

$$(-1)^j \hat F^{(j)}(\beta) = \int_{1-}^{\infty} x^{-\beta} \log^j x\,dF(x) \ge \int_{1-}^{X} x^{-\beta} \log^j x\,dF(x).$$

By assumption, $\hat F$ is analytic on $\{s : \Re s > \alpha\} \cup \{s : |s - \alpha| < \delta\}$ for some $\delta > 0$. Let $R > \beta - \alpha$ be a number such that the disk of center $\beta$ and radius $R$ lies within the domain of analyticity of $\hat F$. Let $s$ be a real number such that $\beta - R < s < \alpha$.

We have

$$\hat F(s) \ge \sum_{j=0}^{\infty} \frac{(\beta - s)^j}{j!} \int_{1-}^{X} x^{-\beta} \log^j x\,dF(x) = \int_{1-}^{X} \sum_{j=0}^{\infty} \frac{(\beta - s)^j \log^j x}{j!}\,x^{-\beta}\,dF(x),$$

since the series converges uniformly for $1 \le x \le X$ by Weierstrass' M-test. If we sum the series, we obtain

$$\hat F(s) \ge \int_{1-}^{X} e^{(\beta - s)\log x} x^{-\beta}\,dF(x) = \int_{1-}^{X} x^{-s}\,dF(x).$$

We see that the last integral is bounded above for all $X$. Since the integrand is nonnegative and $F$ is monotone on $(x_0, \infty)$, the integral defining $\hat F(s)$ converges at $s$. This contradicts the condition that $\alpha$ is the abscissa of convergence of $\hat F$.
3 Some Applications of Landau's Theorem in Number Theory


There are several applications of Landau's Theorem that are well known, one of which we mention here in passing; details can be found in Ch. 5 of [7]. Let $\psi(x) = \sum \log p$, where the sum extends over all prime powers $p^m \le x$. By examining the behaviour of the left hand side of the identity

$$-\frac{\zeta'(s)}{s\zeta(s)} - \frac{1}{s - 1} - \frac{C}{s - \alpha} = \int_1^{\infty} \{\psi(x) - x - Cx^{\alpha}\}\,x^{-s-1}\,dx,$$

we see that $\psi(x) - x - Cx^{\alpha}$ changes sign infinitely often for any real value of $C$ and any value of $\alpha < \beta$, where $\beta$ denotes the sup of the real parts of the zeros of the Riemann zeta function. Now we turn to a few applications of Landau's Theorem that may be less familiar.


Nonvanishing of L Functions 


One of Landau's first uses of this result was to give (in §2 of [10]) a simple proof that $L(1, \chi) \ne 0$ for $\chi$ any real non principal residue character, where

$$L(s, \chi) = \sum_{n=1}^{\infty} \chi(n)n^{-s} = \prod_p \left(1 - \chi(p)p^{-s}\right)^{-1},$$

with the sum representation valid for $\Re s > 0$ and the product valid for $\Re s > 1$. We shall not prove just this result, but instead will use Landau's theorem to establish a more general assertion following the lines of [2].


Theorem 3 If $\chi$ is a residue character and $a$ is a real number (with $a \ne 0$ in case $\chi$ is the principal character for some modulus), then $L(1 + ia, \chi) \ne 0$.


Proof: We first remark that if $a = 0$ and $\chi$ is the principal character for some modulus $k$, then $L(s, \chi)$ has a simple pole at $s = 1$. In the other cases, $L(1 + ia, \chi)$ is finite. We assume that $L(1 + ia, \chi) = 0$ and apply Landau's theorem to get a contradiction. By complex conjugation $L(1 - ia, \bar\chi)$ also is zero. We consider the Dirichlet series $\sum a(n)n^{-s}$ of the function $f$ defined for $\Re s > 1$ by

$$f(s) = \zeta(s)^2 L(s + ia, \chi)L(s - ia, \bar\chi).$$

It follows from our assumption of the vanishing of $L(1 + ia, \chi)$ and $L(1 - ia, \bar\chi)$ that $f$ is regular at $s = 1$, since the double pole of $\zeta^2$ is canceled by these zeros. Then $f$ is regular at every point of the positive real axis. We are going to show that $a(n) \ge 0$ for all $n$ and that the series $\sum a(n)n^{-s}$ fails to converge for some positive values of $s$, a contradiction to Landau's Theorem.

From the Euler products for $\zeta$ and $L$ it is seen that $a$ is a multiplicative arithmetic function. To prove that $a(n) \ge 0$ for all $n$ it thus suffices to establish the relation for each prime power value of $n$. Also, we have from the Euler product

$$1 + \frac{a(p)}{p^s} + \frac{a(p^2)}{p^{2s}} + \cdots = \left(1 - \frac{1}{p^s}\right)^{-2}\left(1 - \frac{\epsilon(p)}{p^s}\right)^{-1}\left(1 - \frac{\bar\epsilon(p)}{p^s}\right)^{-1}, \qquad \Re s > 0,$$
where $\epsilon(p) = \chi(p)p^{-ia}$, a function whose modulus is either 0 or 1. Using the representation

$$(1 - z)^{-1} = \exp\left(\sum_{n=1}^{\infty} z^n/n\right) \qquad (|z| < 1),$$

we obtain

$$1 + \frac{a(p)}{p^s} + \frac{a(p^2)}{p^{2s}} + \cdots = \exp\left(\sum_{k=1}^{\infty} \frac{2 + \epsilon(p)^k + \bar\epsilon(p)^k}{k\,p^{ks}}\right). \tag{$*$}$$

Since $2 + \epsilon(p)^k + \bar\epsilon(p)^k = 2 + 2\Re\,\epsilon(p)^k \ge 0$ for each $p$ and $k$, use of the exponential series shows that $a$ is nonnegative.
From ($*$) it follows that $a(p) = 2 + \epsilon(p) + \bar\epsilon(p) \ge 0$ and that

$$a(p^2) = (2 + \epsilon(p) + \bar\epsilon(p))^2/2 + (2 + \epsilon(p)^2 + \bar\epsilon(p)^2)/2 = 2 - \epsilon(p)\bar\epsilon(p) + (1 + \epsilon(p) + \bar\epsilon(p))^2 \ge 2 - |\epsilon(p)|^2 \ge 1.$$

Thus

$$\sum_{n=1}^{\infty} \frac{a(n)}{n^{1/2}} \ge \sum_p \frac{a(p^2)}{p} \ge \sum_p \frac{1}{p}.$$

From the divergence of the last sum, we see that $\sum a(n)/n^{1/2}$ diverges. Thus the abscissa of convergence of $\sum a(n)/n^s$ is positive, contradicting the regularity of $f$ on the positive real axis. Thus the assumption that $L(1 + ia, \chi) = 0$ has led to a contradiction. $\square$
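The coefficient bounds used in the proof can be tested numerically. This is a sketch under stated assumptions: the local Euler factor is taken to be $(1 - z)^{-2}(1 - \epsilon z)^{-1}(1 - \bar\epsilon z)^{-1}$ with $z = p^{-s}$, and the function name `local_coefficients` and the random sampling of $\epsilon$ on the unit circle are illustrative choices.

```python
import cmath
import random

def local_coefficients(eps: complex, terms: int = 8) -> list[float]:
    """Coefficients a(p^k), k < terms, of (1-z)^-2 (1-eps*z)^-1 (1-conj(eps)*z)^-1."""
    def multiply(u, v):
        # Truncated Cauchy product of two power series.
        return [sum(u[i] * v[k - i] for i in range(k + 1)) for k in range(terms)]

    def geometric(w):
        # Power series of 1 / (1 - w*z).
        return [w ** k for k in range(terms)]

    series = multiply(multiply(geometric(1), geometric(1)),
                      multiply(geometric(eps), geometric(eps.conjugate())))
    return [c.real for c in series]

random.seed(0)
for _ in range(1000):
    eps = cmath.exp(2j * cmath.pi * random.random())   # a point with |eps| = 1
    a = local_coefficients(eps)
    assert abs(a[1] - (2 + 2 * eps.real)) < 1e-9       # a(p) = 2 + eps + conj(eps)
    assert min(a) > -1e-9                              # a(p^k) >= 0
    assert a[2] > 1 - 1e-9                             # a(p^2) >= 1
print("coefficient checks passed")
```

The case $\epsilon = 0$ (when $p$ divides the modulus) is not sampled here; there the factor reduces to $(1 - z)^{-2}$, whose coefficients are $1, 2, 3, \dots$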


Schnirelmann Density of k-th Power-free Integers 


Let $Q_k$ denote the set of $k$-th power-free integers, i.e. the integers not divisible by the $k$-th power of any prime. The elements of the set $Q_2 = \{1, 2, 3, 5, 6, 7, 10, \dots\}$ are called square-free integers. Let $Q_k(x)$ denote the number of elements of $Q_k$ not exceeding $x$. We have

$$Q_k(x) = \sum_{n \le x} \sum_{d^k \mid n} \mu(d) = \frac{x}{\zeta(k)} + O(x^{1/k}),$$

where $\zeta$ again denotes the Riemann zeta function. Thus $d(Q_k)$, the asymptotic density of $Q_k$, is $1/\zeta(k)$. The Schnirelmann density of $Q_k$ is, by definition, $\sigma(Q_k) := \inf_{x \ge 1} Q_k(x)/[x]$. It is known, e.g., that $\sigma(Q_2) = Q_2(176)/176 = 0.60227\ldots < 6/\pi^2 = 1/\zeta(2)$. We have the following comparison of $\sigma(Q_k)$ and $d(Q_k)$, a result first proved by Stark in [13].
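The numerical value quoted for $k = 2$ can be reproduced with a short sieve. This is a sketch only: the function name `squarefree_counts` and the search limit 2000 are illustrative choices, and a finite search of course does not by itself establish the infimum over all $x$.

```python
from fractions import Fraction

def squarefree_counts(limit: int) -> list[int]:
    """counts[x] = number of squarefree integers in [1, x], sieving out multiples of d^2."""
    is_squarefree = [True] * (limit + 1)
    d = 2
    while d * d <= limit:
        for multiple in range(d * d, limit + 1, d * d):
            is_squarefree[multiple] = False
        d += 1
    counts = [0] * (limit + 1)
    for x in range(1, limit + 1):
        counts[x] = counts[x - 1] + is_squarefree[x]
    return counts

counts = squarefree_counts(2000)
# Integer x in [1, 2000] minimising Q_2(x) / x, compared exactly as fractions.
best = min(range(1, 2001), key=lambda x: Fraction(counts[x], x))
print(best, counts[best], counts[best] / best)
```

Comparing the ratios as exact fractions avoids any ambiguity from floating-point ties when locating the minimiser.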


Theorem 4 $\sigma(Q_k) < d(Q_k)$ for all $k \ge 2$.

Proof: For $\Re s > 1$ we have

$$\frac{\zeta(s)}{\zeta(ks)} = \sum_{n \in Q_k} n^{-s} = \int_{1-}^{\infty} x^{-s}\,dQ_k(x) = s\int_1^{\infty} Q_k(x)\,x^{-s-1}\,dx,$$

so that

$$\frac{1}{s}\left\{\frac{\zeta(s)}{\zeta(ks)} - \frac{\zeta(s)}{\zeta(k)}\right\} = \int_1^{\infty} \left\{Q_k(x) - \frac{[x]}{\zeta(k)}\right\} x^{-s-1}\,dx.$$

Now $\zeta(s)/\zeta(ks) - \zeta(s)/\zeta(k)$ has no singularities on the positive real axis, but it has many singularities with real part $1/(2k)$ (arising from zeros of $\zeta(s)$ with $\Re s = 1/2$). Thus the abscissa of convergence of the Mellin transform is at least $1/(2k)$. It follows from Landau's theorem that $Q_k(x) - [x]/\zeta(k)$ changes sign infinitely often. In particular, $Q_k(x)/[x] < 1/\zeta(k)$ holds for some values of $x$, and thus we have shown, in one step, that $\sigma(Q_k) < 1/\zeta(k)$ for each integer $k \ge 2$. $\square$


We note that the same argument shows that $Q_k(x) - [x]/\zeta(k) + Cx^{\alpha}$ changes sign infinitely often, for any real value of $C$, if $k\alpha$ is less than the sup of the real parts of the zeros $\rho$ of $\zeta$ for which $\rho/k$ is not a zero of $\zeta$.


A Power Series Involving the Möbius Function


It is well known that the Dirichlet series $\sum_{n=1}^{\infty} \mu(n)n^{-s}$ for the Möbius function is the reciprocal of the Riemann zeta function, a representation valid in the half plane $\{s : \Re s > 1\}$, and further that $1/\zeta(s)$ is continuable as a meromorphic function on $\mathbb{C}$. Thus, the divergent infinite series $\sum \mu(n)$ may be considered to have some evanescent connection with $1/\zeta(0)$, whose value is known to be $-2$. The behaviour of the related power series $\sum_{n=1}^{\infty} \mu(n)r^n$ as $r \to 1-$ was recently investigated by Delange [3]. Empirical data in [5] strongly suggest that the limit of the power series is $-2$, but this is completely misleading. Landau's Theorem enables us to show that the series is in fact unbounded from above and below as $r \to 1-$. It is convenient for our calculations to use $e^{-1/x}$, $x > 0$, in place of $r$.
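For readers who wish to experiment with this series, here is a minimal sketch that evaluates a truncation of $\sum \mu(n)e^{-n/x}$. The function names, the truncation rule (about $60x$ terms) and the sample values of $x$ are illustrative assumptions; nothing below reproduces the data of [5].

```python
import math

def mobius_sieve(limit: int) -> list[int]:
    """mu[n] for 0 <= n <= limit (mu[0] is unused), computed with a linear sieve."""
    mu = [1] * (limit + 1)
    is_composite = [False] * (limit + 1)
    primes = []
    for n in range(2, limit + 1):
        if not is_composite[n]:
            primes.append(n)
            mu[n] = -1
        for p in primes:
            if n * p > limit:
                break
            is_composite[n * p] = True
            if n % p == 0:
                mu[n * p] = 0      # p^2 divides n*p
                break
            mu[n * p] = -mu[n]
    return mu

def mobius_power_series(x: float, limit: int) -> float:
    """Truncation of sum_{n>=1} mu(n) * exp(-n/x) at n = limit."""
    mu = mobius_sieve(limit)
    return sum(mu[n] * math.exp(-n / x) for n in range(1, limit + 1))

for x in (10.0, 100.0, 1000.0):
    # With 60*x terms the neglected tail is at most about x * exp(-60).
    print(x, mobius_power_series(x, int(60 * x)))
```

Since the oscillations established below are of size comparable to a small power of $x$, computations at moderate $x$ are not expected to reveal them clearly.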


Theorem 5 Let $0 < \theta < 1/2$ and let $C$ be real. Then

$$\sum_{n=1}^{\infty} \mu(n)e^{-n/x} - Cx^{\theta}$$

changes sign infinitely often on the interval $0 < x < \infty$.


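Proof: For $\Re s > 1$ we have

$$\Gamma(s)n^{-s} = \int_0^{\infty} u^{s-1}e^{-nu}\,du = \int_0^{\infty} x^{-s-1}e^{-n/x}\,dx,$$

and by multiplying by $\mu(n)$, summing, and changing sum and integral we get

$$\Gamma(s)/\zeta(s) = \int_0^{\infty} x^{-s-1} \sum_{n=1}^{\infty} \mu(n)e^{-n/x}\,dx.$$

Similarly,

$$\Gamma(s)\zeta(s + 1 - \theta) = \int_0^{\infty} x^{-s-1} \sum_{n=1}^{\infty} n^{\theta-1}e^{-n/x}\,dx,$$

so that

$$\Gamma(s)\{1/\zeta(s) - c\,\zeta(s + 1 - \theta)\} = \int_0^{\infty} x^{-s-1}\left\{\sum_{n=1}^{\infty} \mu(n)e^{-n/x} - c\sum_{n=1}^{\infty} n^{\theta-1}e^{-n/x}\right\} dx$$

holds for any real $c$. Now the last integral is not of the Mellin form required for Theorem 2, because the integration range extends to zero. But

$$A(s) := \int_0^1 x^{-s-1}\left\{\sum_{n=1}^{\infty} \mu(n)e^{-n/x} - c\sum_{n=1}^{\infty} n^{\theta-1}e^{-n/x}\right\} dx = \int_1^{\infty} u^{s-1}\left\{\sum_{n=1}^{\infty} \mu(n)e^{-nu} - c\sum_{n=1}^{\infty} n^{\theta-1}e^{-nu}\right\} du$$

is an entire function of $s$, and we consider

$$\Gamma(s)\{1/\zeta(s) - c\,\zeta(s + 1 - \theta)\} - A(s) = \int_1^{\infty} x^{-s-1}\left\{\sum_{n=1}^{\infty} \mu(n)e^{-n/x} - c\sum_{n=1}^{\infty} n^{\theta-1}e^{-n/x}\right\} dx.$$

The left-hand side here has complex singularities with real part $1/2$ but no real singularity greater than $\theta$. Thus

$$\sum_{n=1}^{\infty} \mu(n)e^{-n/x} - c\sum_{n=1}^{\infty} n^{\theta-1}e^{-n/x}$$

changes sign infinitely often for any real $c$. Since

$$\sum_{n=1}^{\infty} n^{\theta-1}e^{-n/x} \sim \Gamma(\theta)x^{\theta}$$

as $x \to \infty$, the theorem follows. $\square$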


Remark: The above theorem shows that if $0 < \theta < 1/2$ then

$$(1 - r)^{\theta} \sum_{n=1}^{\infty} \mu(n)r^n$$

is unbounded from above and below as $r \to 1-$. By a refinement of the above argument, Delange has shown that


$$(1 - r)^{1/2} \sum_{n=1}^{\infty} \mu(n)r^n$$

has a positive limit superior and negative limit inferior as $r \to 1-$.
4 An $L^2$ Theorem of Wiener and Wintner


An $L^2$ analogue of Landau's theorem for the transform of a nonnegative function was treated by Wiener and Wintner in [17]. For a nonnegative function $f$ and an arbitrary compact set $E \subset \mathbb{R}$, we can estimate $\int_E |\hat f|^2$ in terms of $\int |\hat f|^2$ taken over a small neighborhood of the distinguished point. We begin with a power series version of this result which was proved earlier by Erdős and Fuchs in [4].


Theorem 6 Suppose that $\hat f(z) = \sum_{n=0}^{\infty} a_n z^n$ denotes a power series that converges for $|z| < R$, and suppose that $a_n \ge 0$ for all $n$. Let $f_r(\theta) := \hat f(re^{i\theta})$, $r < R$, and suppose that $0 < \epsilon < \pi$. Then

$$\int_{-\pi}^{\pi} |f_r(\theta)|^2\,d\theta \le \left(\left[\frac{2\pi}{\epsilon}\right] + 1\right) \int_{-\epsilon}^{\epsilon} |f_r(\theta)|^2\,d\theta.$$


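Proof: The argument depends on the following inequality. Let $g, h$ denote functions on $\mathbb{R}$ of period $2\pi$ belonging to $L^2[-\pi, \pi]$; suppose they have Fourier series $\sum \hat g(n)e^{in\theta}$ and $\sum \hat h(n)e^{in\theta}$ respectively satisfying $\hat g(n) \ge 0$, $\hat h(n) \ge 0$, $n \in \mathbb{Z}$. Then for all real $c$,

$$\left|\int_{-\pi}^{\pi} g(\theta)h(\theta - c)\,d\theta\right| = 2\pi\left|\sum_n \hat g(n)\hat h(n)e^{inc}\right| \le 2\pi\sum_n \hat g(n)\hat h(n) = \int_{-\pi}^{\pi} g(\theta)h(\theta)\,d\theta.$$

We choose

$$g(\theta) = |f_r(\theta)|^2 = \sum_{m,n=0}^{\infty} a_m a_n r^{m+n}e^{i(m-n)\theta} = \sum_{k=-\infty}^{\infty} A_{k,r}e^{ik\theta},$$

with

$$A_{k,r} = \sum_{n=0}^{\infty} a_n a_{n+|k|} r^{2n+|k|} \ge 0,$$

and

$$h(\theta) = \frac{\epsilon}{2\pi} + \frac{\epsilon}{\pi}\sum_{n=1}^{\infty} \left(\frac{\sin(\epsilon n/2)}{\epsilon n/2}\right)^2 \cos n\theta = (1 - |\theta|/\epsilon)^+, \qquad -\pi \le \theta \le \pi.$$

It follows that

$$\int_{-\pi}^{\pi} |f_r(\theta)|^2 h(\theta - c)\,d\theta \le \int_{-\pi}^{\pi} |f_r(\theta)|^2 h(\theta)\,d\theta \le \int_{-\epsilon}^{\epsilon} |f_r(\theta)|^2\,d\theta.$$

Now by superposition with $c = 0, \epsilon, \dots, [2\pi/\epsilon]\epsilon$ we obtain the claimed inequality, since $h(\theta - j\epsilon) + h(\theta - (j + 1)\epsilon) = 1$ for $j\epsilon \le \theta \le (j + 1)\epsilon$, $j = 0, 1, \dots, [2\pi/\epsilon] - 1$, and $h(\theta - [2\pi/\epsilon]\epsilon) + h(\theta) \ge 1$ for $[2\pi/\epsilon]\epsilon \le \theta \le 2\pi$. $\square$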
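Two ingredients of this proof lend themselves to a numerical check: the nonnegative Fourier coefficients of the tent function $(1 - |\theta|/\epsilon)^+$, and the fact that its translates by multiples of $\epsilon$ cover $[0, 2\pi]$ with total weight at least 1. The sketch below is illustrative only; the function names, the midpoint rule and the sample value $\epsilon = 0.7$ are assumptions.

```python
import math

def tent(theta: float, eps: float) -> float:
    """2*pi-periodic tent function equal to (1 - |theta|/eps)^+ on [-pi, pi)."""
    theta = (theta + math.pi) % (2 * math.pi) - math.pi   # reduce to [-pi, pi)
    return max(0.0, 1.0 - abs(theta) / eps)

def fourier_coefficient(n: int, eps: float, steps: int = 20000) -> float:
    """Midpoint-rule value of (1/(2*pi)) * integral of tent(theta) * cos(n*theta) over [-eps, eps]."""
    width = 2 * eps / steps
    total = 0.0
    for i in range(steps):
        theta = -eps + (i + 0.5) * width
        total += tent(theta, eps) * math.cos(n * theta) * width
    return total / (2 * math.pi)

def closed_form(n: int, eps: float) -> float:
    """(eps/(2*pi)) * (sin(eps*n/2) / (eps*n/2))^2, read as eps/(2*pi) when n = 0."""
    if n == 0:
        return eps / (2 * math.pi)
    half = eps * n / 2
    return eps / (2 * math.pi) * (math.sin(half) / half) ** 2

eps = 0.7
for n in range(6):
    print(n, fourier_coefficient(n, eps), closed_form(n, eps))

# Translates by 0, eps, ..., [2*pi/eps]*eps should sum to at least 1 on [0, 2*pi].
translates = math.floor(2 * math.pi / eps) + 1
cover = min(sum(tent(2 * math.pi * k / 1000 - j * eps, eps) for j in range(translates))
            for k in range(1000))
print(translates, cover)
```

The number of translates used, $[2\pi/\epsilon] + 1$, is what produces the constant in front of the integral over $[-\epsilon, \epsilon]$.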
Using the completeness of $L^2$, an immediate consequence of the theorem is


Corollary 6.1 Let $f_r$ and $\varepsilon$ be as in the theorem, and suppose that the restriction of $f_r$
to a segment $-\varepsilon \le \theta \le \varepsilon$ has a limit in $L^2[-\varepsilon, \varepsilon]$ as $r \to R-$. Then $f_r$ has a limit in
$L^2[-\pi, \pi]$ as $r \to R-$.


By partial summation and a simple estimate we can get the following extension of the 
theorem. 


Corollary 6.2 Let $f_r$, $R$, and $\varepsilon$ be as in the theorem, but in place of the condition that $a_n$
be nonnegative, suppose that a) $0 < R < 1$ and b) all Cesàro sums of $a_n$ of some fixed
order are nonnegative. Then

\[ \int_{-\pi}^{\pi} |f_r(\theta)|^2 \, d\theta \ll \int_{-\varepsilon}^{\varepsilon} |f_r(\theta)|^2 \, d\theta. \]

(The condition $0 < R < 1$ was not stated in [17], but the result is not valid if $R > 1$, as
the example $a_n = (-1)^n$ shows.)
For the Mellin version of Wiener and Wintner’s results, let $\hat F(s) = \int_{1-}^{\infty} x^{-s} \, dF(x)$,
where $F \in \mathcal{V}$. We write $s = \sigma + it$ with $\sigma$ and $t$ real and let $\sigma_c(\hat F)$ denote the abscissa of
convergence of $\hat F$.


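Theorem 7 Let $F \in \mathcal{V}$ be monotone increasing, $0 < \varepsilon < 1$, $b \ge \varepsilon$, $d$ be any real number,
and $\sigma > \sigma_c(\hat F)$. Then

\[ \frac{1}{2b} \int_{d-b}^{d+b} |\hat F(\sigma + it)|^2 \, dt \le 2\left( \frac{1}{2\varepsilon} + \frac{1}{2b} \right) \int_{-\varepsilon}^{\varepsilon} |\hat F(\sigma + it)|^2 \, dt. \]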


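Proof: This time we consider functions $h(t) := (1 - |t|/\varepsilon)^+$, with Fourier transform

\[ \hat h(x) = \int_{-\infty}^{\infty} h(t) e^{ixt} \, dt = \varepsilon \left( \frac{\sin(\varepsilon x/2)}{\varepsilon x/2} \right)^2 \ge 0, \]

and

\[ g(t) := |\hat F(\sigma + it)|^2 = \iint (xy)^{-\sigma} (y/x)^{it} \, dF(x) \, dF(y) = \int_{-\infty}^{\infty} e^{itu} \, dG(u), \]

where

\[ G(u) = \iint_{T(u)} (xy)^{-\sigma} \, dF(x) \, dF(y), \qquad T(u) := \{(x, y) : x, y \ge 1-, \ y \le e^{u} x\}. \]

Thus $g$ is the Fourier transform of a positive measure.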
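For any real $c$ we have

\[ \begin{aligned}
\int_{c-\varepsilon}^{c+\varepsilon} |\hat F(\sigma + it)|^2 \left( 1 - \frac{|t - c|}{\varepsilon} \right) dt
&= \int_{-\infty}^{\infty} g(t) h(t - c) \, dt \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{itu} \, dG(u) \, h(t - c) \, dt = \int_{-\infty}^{\infty} \hat h(u) e^{icu} \, dG(u) \\
&\le \int_{-\infty}^{\infty} \hat h(u) \, dG(u) = \int_{-\varepsilon}^{\varepsilon} |\hat F(\sigma + it)|^2 \left( 1 - \frac{|t|}{\varepsilon} \right) dt \le \int_{-\varepsilon}^{\varepsilon} |\hat F(\sigma + it)|^2 \, dt.
\end{aligned} \]

The result follows by taking $c = d - b + j\varepsilon$ with $j = 0, 1, \ldots, [2b/\varepsilon] + 1$ and adding the
results. □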
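
The last step of the proof is a covering argument: the triangles $(1 - |t - c|/\varepsilon)^+$ centered at $c = d - b + j\varepsilon$ add up to at least 1 on $[d-b, d+b]$, and the number of centers divided by $2b$ is at most $1/\varepsilon + 1/b$. The following sketch checks both facts on a grid for hypothetical values of $d$, $b$, $\varepsilon$; the function names are ours and the code is only an illustration.

```python
import math

def triangle(t, eps):
    """h(t) = (1 - |t|/eps)^+."""
    return max(0.0, 1.0 - abs(t) / eps)

def centers(d, b, eps):
    """Centers c = d - b + j*eps for j = 0, 1, ..., [2b/eps] + 1."""
    return [d - b + j * eps for j in range(math.floor(2 * b / eps) + 2)]

def cover(t, d, b, eps):
    """Sum of the shifted triangles at the point t."""
    return sum(triangle(t - c, eps) for c in centers(d, b, eps))

# Hypothetical parameters with b >= eps.
d, b, eps = 3.0, 2.0, 0.3
worst = min(cover(d - b + 0.01 * i, d, b, eps) for i in range(401))
print(worst, len(centers(d, b, eps)) / (2 * b), 1 / eps + 1 / b)
```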


As before, we have some corollaries. For the first of them, we define repeated integrals
of $F \in \mathcal{V}$ by setting $F_0(x) = F(x)$ and for $n \ge 1$ let $F_n(x) = \int_1^x F_{n-1}(t) \, dt$.


Corollary 7.1 Let $F$, $b$, $\varepsilon$ be as in the theorem. Suppose that $\sigma_c(\hat F) > 0$ and, for some
positive integer $N$, we have $F_N > 0$. If

\[ \lim_{\sigma \to \sigma_c+} \hat F(\sigma + it) \in L^2[-\varepsilon, \varepsilon], \]

it follows that

\[ \lim_{\sigma \to \sigma_c+} \hat F(\sigma + it) \in L^2[-b, b]. \]


The following estimate is useful in proving Halász’s theorem on mean values of multiplicative functions.


Corollary 7.2 Let $F$ be as in the theorem with $\sigma_c = a$ and suppose that $\hat F(s) \ll 1/|s - a|$
holds in a region $\{s : \sigma > a, \ |s - a| \le \varepsilon\}$. Then

\[ \int_{-\infty}^{\infty} \frac{|\hat F(\sigma + it)|^2}{\sigma^2 + t^2} \, dt \ll \frac{1}{\sigma - a}. \]


5 Quantitative Versions of Landau’s Oscillation Theorem 


Landau’s Oscillation Theorem is a relatively straightforward analytic tool and does not 
always give the best possible results. The theorem has several quantitative extensions, of 
which the Wiener-Wintner theorems are examples. Here we discuss a few more results of 
this general type. The first of these estimates the oscillation of a function in terms of the 
principal part of the generating function at a pole on the abscissa of convergence. This is 
an idea which goes back at least as far as Erhard Schmidt’s paper [12]. 


Theorem 8 Suppose that $F$ is in the function class $\mathcal{V}$ (defined in §1), $\hat F$ exists and has a
continuation as a meromorphic function to some domain containing $\{s : \Re s \ge \beta\}$ with a
pole of order $m \ge 1$ at some point $\beta + i\gamma$ with $\beta > 0$ and $\gamma > 0$. Assume that the principal
part there is

\[ \sum_{j=-1}^{-m} c_j (s - \beta - i\gamma)^{j}. \]

Finally, suppose that there exists no singularity of $\hat F$ on the real segment $[\beta, \infty)$. Then

\[ \limsup_{x \to \infty} \frac{F(x)}{x^{\beta} (\log x)^{m-1}} \ge \frac{|c_{-m}|}{|\beta + i\gamma| \, \Gamma(m)}, \qquad \liminf_{x \to \infty} \frac{F(x)}{x^{\beta} (\log x)^{m-1}} \le \frac{-|c_{-m}|}{|\beta + i\gamma| \, \Gamma(m)}. \]


Proof: It suffices to show the second relation; the first follows upon replacing $F$ by $-F$.
Suppose that $c$ is a positive number such that $F(x) + c x^{\beta} (\log x)^{m-1}/\Gamma(m)$ is positive for
all sufficiently large values of $x$. If no such $c$ exists, then $F(x)/(x^{\beta} (\log x)^{m-1})$ has limit
inferior $-\infty$ and there is nothing further to show.

For $\sigma > \beta$ we have

\[ \frac{\hat F(s)}{s} + \frac{c}{(s - \beta)^m} = \int_1^{\infty} x^{-s-1} \left\{ F(x) + \frac{c \, x^{\beta} (\log x)^{m-1}}{\Gamma(m)} \right\} dx, \]

or briefly $g(s) = \int_1^{\infty} x^{-s-1} f(x) \, dx$. This representation is valid for all $\sigma > \beta$ by Landau’s
Theorem. Since $f$ is assumed positive for all $x \ge X$, for some $X$, we have

\[ |g(s)| \le \int_1^{X} x^{-\sigma-1} |f(x)| \, dx + \int_X^{\infty} x^{-\sigma-1} f(x) \, dx \le \int_1^{X} x^{-\sigma-1} \{ |f(x)| - f(x) \} \, dx + g(\sigma) = g(\sigma) + O(1). \]

Now take $s = \sigma + i\gamma$ and let $\sigma \to \beta+$. We have

\[ \frac{\{1 + o(1)\} |c_{-m}|}{(\sigma - \beta)^m \, |\beta + i\gamma|} = |g(\sigma + i\gamma)| \le g(\sigma) + O(1) = \frac{c}{(\sigma - \beta)^m} + O(1). \]

The last inequality implies that $c \ge |c_{-m}|/|\beta + i\gamma|$. Thus, if we choose $c' < |c_{-m}|/|\beta + i\gamma|$,
then we must have $F(x) + c' x^{\beta} (\log x)^{m-1}/\Gamma(m) \le 0$ for a sequence of arbitrarily
large values of $x$. □


If $\beta$ denotes the supremum of the real parts of the zeros of the Riemann zeta function and
if the supremum is actually attained, then the preceding theorem gives Schmidt’s result that

\[ \limsup_{x \to \infty} (\psi(x) - x)/x^{\beta} > 0, \qquad \liminf_{x \to \infty} (\psi(x) - x)/x^{\beta} < 0. \]


Theorem 8 also yields the result of Delange quoted at the end of Section 3. 

If one is in possession of further information about the generating function, one can use 
more sophisticated methods. For example, Ingham established in [8] quantitative oscillation 
estimates using many singularities of the generating function on the abscissa of convergence. 
His method was based on an identity occurring in the most familiar proof of the Wiener-
Ikehara tauberian theorem; an excellent discussion of this method is given in [1].

Finally, for $F$ absolutely continuous with a piecewise continuous derivative $\omega$, we mention
the problem of estimating the number of sign changes of $\omega$. If the generating function

\[ \hat F(s) = \int_1^{\infty} x^{-s} \omega(x) \, dx \]

is regular at the real point on the abscissa of convergence, then Landau’s theorem guarantees
that $\omega$ changes sign infinitely often. A theorem of Steinig [14] provides a lower bound on
the number of changes of sign of $\omega$ in any interval $[1, X]$ under the assumption that the
generating function $\hat F$ has a continuation as a meromorphic function to an open half plane
that properly includes the half plane of convergence.
Modular Equations in Ramanujan’s Lost Notebook
Bruce C. Berndt 


0 Introduction 


Ramanujan recorded several hundred modular equations in his three notebooks [7]; no other
mathematician has ever discovered nearly so many. Complete proofs for all the modular
equations in Ramanujan’s three notebooks can be found in Berndt’s books [1]–[3]. In particular,
Chapters 19–21 in Ramanujan’s second notebook are almost exclusively devoted to
modular equations. Ramanujan used modular equations to evaluate class invariants, certain
q-continued fractions including the Rogers–Ramanujan continued fraction, theta-functions,
and certain other quotients and products of theta-functions and eta-functions [3].

In his lost notebook, and in a few fragments published with the lost notebook [9], 
Ramanujan organized some of his modular equations by type, rather than by degree as 
he did in his second notebook. These lists cover the most important kinds of modular 
equations. Although many of these modular equations are found in his notebooks [7], some
are not. The purpose of this paper is to provide a list and discussion of all these modular
equations and to give proofs for those not found elsewhere in Ramanujan’s notebooks.

Each modular equation is equivalent to a certain theta-function identity, but a theta- 
function identity may not have an equivalent modular equation. Ramanujan’s lost notebook 
contains many new and beautiful theta-function identities which we will not discuss here. 
These are being proved by the author and/or his past and current Ph.D. students in a long 
series of papers currently being written. These students include H.H. Chan, Y.-S. Choi, 
S.-S. Huang, S.-Y. Kang, W.-C. Liaw, J. Sohn, S.H. Son, and L.-C. Zhang. 

In the first section, we examine the modular equations on page 55 of the lost notebook. 
These have been called P–Q modular equations (Part IV [2, p. 204]), or eta-function
identities, or Schläfli-type modular equations [6]. These are among the most elegant and
beautiful modular equations found by Ramanujan and they have been most useful in the 
applications mentioned above. 

In Section 2, we examine a fragment on pages 350-352 of [9] containing six groups 
of modular equations. These include modular equations associated with the names of
A.M. Legendre, H. Schröter, and R. Russell. The last of the six sets contains Ramanujan’s
beautiful formulas for multipliers. 

The brief Section 3 is devoted to a fragment found on page 349 of [9]. 

Before proceeding further, we provide some definitions in preparation for defining a 
modular equation, as Ramanujan would have understood it. 
The complete elliptic integral of the first kind associated with the modulus $k$, $0 < k < 1$,
is defined by

\[ K := K(k) := \int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - k^2 \sin^2 \theta}}. \]

The complementary modulus $k'$ is defined by $k' = \sqrt{1 - k^2}$; set $K' = K(k')$. If $q =
\exp(-\pi K'/K)$, then one of the central theorems in the theory of elliptic functions
asserts that

\[ \varphi^2(q) = \frac{2}{\pi} K(k) = {}_2F_1(\tfrac12, \tfrac12; 1; k^2), \tag{0.1} \]

where $\varphi$ denotes the classical theta-function defined by

\[ \varphi(q) = \sum_{k=-\infty}^{\infty} q^{k^2}, \]

${}_2F_1(\tfrac12, \tfrac12; 1; k^2)$ denotes the ordinary hypergeometric function, and where the last equality
in (0.1) follows from expanding the integrand in a binomial series and integrating termwise.
It is (0.1) upon which all of Ramanujan’s modular equations ultimately rest.
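
As an illustration of (0.1), the following sketch compares the three quantities $\varphi^2(q)$, $(2/\pi)K(k)$, and ${}_2F_1(\tfrac12,\tfrac12;1;k^2)$ numerically for a sample modulus. It assumes the standard arithmetic-geometric mean evaluation $K(k) = \pi/(2\,\mathrm{AGM}(1, k'))$, which is not part of the text above; all function names are ours.

```python
import math

def agm(a, b, rounds=40):
    """Arithmetic-geometric mean of a and b (quadratically convergent)."""
    for _ in range(rounds):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def ellipk(k):
    """Complete elliptic integral K(k), via K(k) = pi / (2 AGM(1, k'))."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))

def phi(q, terms=40):
    """Theta-function phi(q) = sum over all integers j of q^(j^2)."""
    return 1.0 + 2.0 * sum(q ** (j * j) for j in range(1, terms))

def hyp2f1_half(x, terms=400):
    """Truncated series for 2F1(1/2, 1/2; 1; x), 0 <= x < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= ((n + 0.5) / (n + 1.0)) ** 2 * x
    return total

def check_01(k):
    """Return (phi(q)^2, 2K/pi, 2F1) for q = exp(-pi K'/K)."""
    K, Kp = ellipk(k), ellipk(math.sqrt(1.0 - k * k))
    q = math.exp(-math.pi * Kp / K)
    return phi(q) ** 2, 2.0 * K / math.pi, hyp2f1_half(k * k)

print(check_01(0.6))
```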

Let $K$, $K'$, $L$, and $L'$ denote complete elliptic integrals of the first kind associated with
the moduli $k$, $k'$, $\ell$, and $\ell' := \sqrt{1 - \ell^2}$, respectively, where $0 < k, \ell < 1$. Suppose that

\[ n \frac{K'}{K} = \frac{L'}{L} \tag{0.2} \]

for some positive integer $n$. A relation between $k$ and $\ell$ induced by (0.2) is called a modular
equation of degree $n$. In fact, modular equations are always algebraic equations. After
Ramanujan, set $\alpha = k^2$ and $\beta = \ell^2$. In the sequel, we shall frequently say that $\beta$ has degree
$n$ over $\alpha$. Lastly, the multiplier $m$ is defined by

\[ m := \frac{K}{L}. \]

At the end of Section 2, we shall state several formulas for multipliers in terms of $\alpha$ and $\beta$.
These can be regarded as transformations of elliptic integrals, or, by (0.1), transformations
for hypergeometric functions.


1 Eta-function Identities 


After Ramanujan, define

\[ f(-q) = q^{-1/24} \eta(z) = \prod_{n=1}^{\infty} (1 - q^n) = \sum_{k=-\infty}^{\infty} (-1)^k q^{k(3k-1)/2}, \qquad q = \exp(2\pi i z), \]

where $\eta(z)$ denotes the Dedekind eta-function, $|q| < 1$, and the last equality is Euler’s
pentagonal number theorem.
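
Euler’s pentagonal number theorem can be checked coefficient by coefficient with exact integer arithmetic. The sketch below expands the product $\prod_{n \ge 1}(1 - q^n)$ and the series $\sum_k (-1)^k q^{k(3k-1)/2}$ up to a chosen order; the function names are ours.

```python
def euler_product_coeffs(order):
    """Coefficients of prod_{n>=1} (1 - q^n) up to q^order."""
    coeffs = [0] * (order + 1)
    coeffs[0] = 1
    for n in range(1, order + 1):
        # Multiply in place by (1 - q^n), highest degree first.
        for d in range(order, n - 1, -1):
            coeffs[d] -= coeffs[d - n]
    return coeffs

def pentagonal_coeffs(order):
    """Coefficients of sum_k (-1)^k q^(k(3k-1)/2) up to q^order."""
    coeffs = [0] * (order + 1)
    for k in range(order + 1):
        for j in ((0,) if k == 0 else (k, -k)):
            exponent = j * (3 * j - 1) // 2
            if exponent <= order:
                coeffs[exponent] += -1 if j % 2 else 1
    return coeffs

print(euler_product_coeffs(12))
print(pentagonal_coeffs(12))
```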
There are four sets of modular equations on page 55 of [9]. For the first set, Ramanujan
puts

\[ u := \frac{f(-q)}{q^{1/6} f(-q^5)} \quad\text{and}\quad v := \frac{f(-q^n)}{q^{n/6} f(-q^{5n})}. \tag{1.1} \]

Five identities comprise the first set.

Entry 1 For $n = 2$ in (1.1), the functions $u$ and $v$ satisfy the modular equation of degree 5,

\[ uv + \frac{5}{uv} = \left( \frac{u}{v} \right)^3 + \left( \frac{v}{u} \right)^3. \]

Entry 1 is identical to Entry 53 in Chapter 25 of Part IV [2, p. 206].
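
A numerical sanity check of Entry 1 is easy to set up. The sketch below assumes that (1.1) reads $u = f(-q)/(q^{1/6} f(-q^5))$, $v = f(-q^n)/(q^{n/6} f(-q^{5n}))$ and that Entry 1 reads $uv + 5/(uv) = (u/v)^3 + (v/u)^3$; the helper names `f_minus` and `entry1_residual` are ours, and the infinite product is truncated.

```python
def f_minus(q, terms=200):
    """Truncated product for f(-q) = prod_{n>=1} (1 - q^n), 0 < q < 1."""
    product = 1.0
    for n in range(1, terms + 1):
        product *= 1.0 - q ** n
    return product

def u_v(q, n):
    """The pair (u, v) of (1.1) for a real q in (0, 1)."""
    u = f_minus(q) / (q ** (1 / 6) * f_minus(q ** 5))
    v = f_minus(q ** n) / (q ** (n / 6) * f_minus(q ** (5 * n)))
    return u, v

def entry1_residual(q):
    """Left side minus right side of Entry 1 (n = 2)."""
    u, v = u_v(q, 2)
    return (u * v + 5 / (u * v)) - ((u / v) ** 3 + (v / u) ** 3)

for q in (0.01, 0.1, 0.3):
    print(q, entry1_residual(q))
```

The residuals should be at the level of floating-point rounding.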


Entry 2 If $n = 3$ in (1.1), then the functions $u$ and $v$ satisfy the modular equation of degree 15,
5 U6 v\ 6 u\3 v\3 
3 Ea, epeeeeten |e (eek ame Ge ie i e 
w'+(S) {0-0} (0+) 
Entry 2 is the same as Entry 63 of Chapter 25 of Part IV [2, p. 223]. 


Entry 3 If $n = 4$ in (1.1), then the functions $u$ and $v$ satisfy the modular equation of degree 5,
wr (SZ) = CY +Q) (+0) 
lage) aa lGaea | (3.1) 


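Proof: Let $\beta$ have degree 5 over $\alpha$, and let $m$ denote the multiplier of degree 5. From
Entries 12(ii) and 12(iv) in Chapter 17 (Part III [1, p. 124]),

\[ u = \sqrt{m} \left( \frac{1 - \alpha}{1 - \beta} \right)^{1/6} \left( \frac{\alpha}{\beta} \right)^{1/24} \quad\text{and}\quad v = \sqrt{m} \left( \frac{1 - \alpha}{1 - \beta} \right)^{1/24} \left( \frac{\alpha}{\beta} \right)^{1/6}, \tag{3.2} \]

respectively. Recall the definition (Part III [1, p. 284, eq. (13.3)])

\[ \rho = (m^3 - 2m^2 + 5m)^{1/2} \tag{3.3} \]

and the representations (Part III [1, p. 286, eq. (13.12)])

\[ \left( \frac{\alpha}{\beta} \right)^{1/4} = \frac{2m + \rho}{m(m - 1)} \quad\text{and}\quad \left( \frac{1 - \alpha}{1 - \beta} \right)^{1/4} = \frac{2m - \rho}{m(m - 1)}. \tag{3.4} \]

Thus, from (3.2) and (3.4),

\[ \frac{u}{v} = \left( \frac{1 - \alpha}{1 - \beta} \right)^{1/8} \left( \frac{\beta}{\alpha} \right)^{1/8} = \sqrt{\frac{2m - \rho}{m(m - 1)}} \sqrt{\frac{m(m - 1)}{2m + \rho}} = \sqrt{\frac{2m - \rho}{2m + \rho}}. \tag{3.5} \]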
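Hence, after a modicum of elementary algebra,

\[ \frac{u}{v} + \frac{v}{u} = \frac{4m}{(4m^2 - \rho^2)^{1/2}}. \tag{3.6} \]

Next, by (3.2) and (3.4),

\[ (uv)^3 = m^3 \left( \frac{1 - \alpha}{1 - \beta} \right)^{5/8} \left( \frac{\alpha}{\beta} \right)^{5/8} = \frac{(4m^2 - \rho^2)^{5/2}}{m^2 (m - 1)^5}, \]

and so

\[ (uv)^3 + \left( \frac{5}{uv} \right)^3 = \frac{(4m^2 - \rho^2)^5 + 125 m^4 (m - 1)^{10}}{m^2 (m - 1)^5 (4m^2 - \rho^2)^{5/2}}. \tag{3.7} \]

Hence, by (3.6), (3.7), and (3.3),

\[ \begin{aligned}
\left( \frac{u}{v} + \frac{v}{u} \right) \left( (uv)^3 + \left( \frac{5}{uv} \right)^3 \right)
&= 4m \, \frac{(4m^2 - \rho^2)^5 + 125 m^4 (m - 1)^{10}}{m^2 (m - 1)^5 (4m^2 - \rho^2)^3} \\
&= 4m \, \frac{(-m^3 + 6m^2 - 5m)^5 + 125 m^4 (m - 1)^{10}}{m^2 (m - 1)^5 (-m^3 + 6m^2 - 5m)^3} \\
&= 4m \, \frac{(m - 1)^5 (m - 5)^5 - 125 (m - 1)^{10}/m}{(m - 1)^8 (m - 5)^3} \\
&= 4m \, \frac{(m - 5)^5 - 125 (m - 1)^5/m}{(m - 1)^3 (m - 5)^3}.
\end{aligned} \tag{3.8} \]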


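Next, by (3.5) and (3.3),

\[ \left( \frac{u}{v} \right)^3 + \left( \frac{v}{u} \right)^3 = \left( \frac{2m - \rho}{2m + \rho} \right)^{3/2} + \left( \frac{2m + \rho}{2m - \rho} \right)^{3/2} = \frac{16m^3 + 12m\rho^2}{(4m^2 - \rho^2)^{3/2}} = \frac{4m(3m^3 - 2m^2 + 15m)}{(-m^3 + 6m^2 - 5m)^{3/2}}. \tag{3.9} \]

Hence, combining (3.6) and (3.9), with the use of (3.3), we find that

\[ \left( \frac{u}{v} + \frac{v}{u} \right) \left( \left( \frac{u}{v} \right)^3 + \left( \frac{v}{u} \right)^3 \right) = \frac{16(3m^3 - 2m^2 + 15m)}{(-m^2 + 6m - 5)^2}. \tag{3.10} \]

Next, by (3.5) and (3.3),

\[ \left( \frac{u}{v} \right)^5 + \left( \frac{v}{u} \right)^5 = \frac{64m^5 + 160m^3\rho^2 + 20m\rho^4}{(4m^2 - \rho^2)^{5/2}} = \frac{m^3 (20m^4 + 80m^3 + 24m^2 + 400m + 500)}{(-m^3 + 6m^2 - 5m)^{5/2}}. \tag{3.11} \]

Hence, (3.6) and (3.11) yield

\[ \left( \frac{u}{v} + \frac{v}{u} \right) \left( \left( \frac{u}{v} \right)^5 + \left( \frac{v}{u} \right)^5 \right) = \frac{4m(20m^4 + 80m^3 + 24m^2 + 400m + 500)}{(-m^2 + 6m - 5)^3}. \tag{3.12} \]

Hence, multiplying (3.1) by $u/v + v/u$, we find that, by (3.6), (3.10), and (3.12), the new
right side of (3.1) can be written in the form

\[ \frac{16m(5m^4 + 20m^3 + 6m^2 + 100m + 125)}{(-m^2 + 6m - 5)^3} - \frac{128(3m^3 - 2m^2 + 15m)}{(-m^2 + 6m - 5)^2} + \frac{64m}{-m^2 + 6m - 5} + 4. \tag{3.13} \]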
In view of (3.1), combining (3.8) and (3.13), we find that it suffices to prove that

\[ \begin{aligned}
m(m - 5)^5 - 125(m - 1)^5 = {} & -4m(5m^4 + 20m^3 + 6m^2 + 100m + 125) \\
& - 32m(3m^2 - 2m + 15)(m^2 - 6m + 5) \\
& - 16m(m^2 - 6m + 5)^2 + (m^2 - 6m + 5)^3.
\end{aligned} \]

This last equality is easily verified via Mathematica, and this completes the proof. □
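
The closing polynomial identity can also be confirmed without a computer algebra system. The sketch below assumes the identity reads $m(m-5)^5 - 125(m-1)^5 = -4m(5m^4+20m^3+6m^2+100m+125) - 32m(3m^2-2m+15)(m^2-6m+5) - 16m(m^2-6m+5)^2 + (m^2-6m+5)^3$. Both sides are polynomials of degree 6, so exact agreement at more than six integer points proves the identity; the function names are ours.

```python
def lhs(m):
    """Left side: m(m-5)^5 - 125(m-1)^5."""
    return m * (m - 5) ** 5 - 125 * (m - 1) ** 5

def rhs(m):
    """Right side, with p = m^2 - 6m + 5."""
    p = m * m - 6 * m + 5
    return (-4 * m * (5 * m**4 + 20 * m**3 + 6 * m**2 + 100 * m + 125)
            - 32 * m * (3 * m**2 - 2 * m + 15) * p
            - 16 * m * p**2
            + p**3)

# Exact integer arithmetic at 21 points; 7 points would already suffice.
print(all(lhs(m) == rhs(m) for m in range(-10, 11)))
```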


Entry 4 If $n = 5$ in (1.1), then the functions $u$ and $v$ satisfy the modular equation of degree 25,

\[ (uv)^2 + \left( \frac{5}{uv} \right)^2 + 5\left( uv + \frac{5}{uv} \right) + 15 = \left( \frac{v}{u} \right)^3. \tag{4.1} \]

Proof: In Part III [1, p. 268, eq. (11.8)], we proved that

\[ v^6 = \frac{f^6(-q^5)}{q^5 f^6(-q^{25})} = (uv)^5 + 5(uv)^4 + 15(uv)^3 + 25(uv)^2 + 25uv, \]

where we have replaced $q$ by $q^5$ in the cited formulation. Dividing the equality above by
$(uv)^3$ and rearranging the terms, we easily deduce (4.1). □
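
For readers who wish to test (4.1) numerically, here is a minimal sketch. It assumes (4.1) reads $(uv)^2 + (5/uv)^2 + 5(uv + 5/(uv)) + 15 = (v/u)^3$ with $u = f(-q)/(q^{1/6} f(-q^5))$ and $v = f(-q^5)/(q^{5/6} f(-q^{25}))$; the helper names are ours and the product is truncated.

```python
def f_minus(q, terms=200):
    """Truncated product for f(-q) = prod_{n>=1} (1 - q^n), 0 < q < 1."""
    product = 1.0
    for n in range(1, terms + 1):
        product *= 1.0 - q ** n
    return product

def entry4_residual(q):
    """Left side minus right side of (4.1), i.e. the case n = 5 of (1.1)."""
    u = f_minus(q) / (q ** (1 / 6) * f_minus(q ** 5))
    v = f_minus(q ** 5) / (q ** (5 / 6) * f_minus(q ** 25))
    uv = u * v
    return uv ** 2 + (5 / uv) ** 2 + 5 * (uv + 5 / uv) + 15 - (v / u) ** 3

for q in (0.05, 0.1, 0.3):
    print(q, entry4_residual(q))
```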


Entry 5 If $n = 7$ in (1.1), the functions $u$ and $v$ obey the modular equation of degree 35,
52 SY u\4 v4 u\3 v3 
wr+(5) =-{O'-@}-10'+@)] 
uv v u v u 
U2 v\2 uv 
-1)(=) -(-) }+14(=+-). (5.1) 

v u vou 
Proof: We will use the theory of modular forms and employ the theory developed by the
author and L.-C. Zhang in Part IV [2, pp. 237-239].


Let $q = \exp(2\pi i z)$, where $\operatorname{Im} z > 0$, and recall that $f(-q) = q^{-1/24} \eta(z)$, where $\eta$
denotes the Dedekind eta-function. In the notation of Part IV [2, p. 237],

\[ uv = R_{5,7}(z) \quad\text{and}\quad v/u = S_{5,7}(z). \]
By Lemmas 68.1 and 68.2 in Part IV [2, pp. 237, 238], we deduce that

\[ R_{5,7}^3(z), \ S_{5,7}(z) \in \{\Gamma_0(35), 0, 1\}, \]

where $\{\Gamma_0(n), 0, 1\}$ is the space of modular forms on $\Gamma_0(n)$ of weight 0 and multiplier
system identically equal to 1.
From Part IV [2, p. 239], if r/s denotes a cusp with (r,s) = 1, then, for any pair of 
positive integers m,n, 
r (mn, s)? 
d ( , 7) = ’ 
oer INN) ) 24mn 


where $(a, b)$ denotes the greatest common divisor of $a$ and $b$. A complete set of inequivalent
cusps for $\Gamma_0(35)$ is $\{0, \infty, \tfrac15, \tfrac17\}$. Using (5.2) repeatedly, we compose the following
table summarizing the information that we need about the orders of certain functions at
these cusps. We have abbreviated the left and right sides of (5.1) by L(5.1) and R(5.1),
respectively.


(5.2) 


cusp/order    u        v        uv       u/v      L(5.1)   R(5.1)
0             1/30     1/210    4/105    1/35     -4/35    -4/35
1/5           -1/6     -1/42    -4/21    -1/7     -4/7     -4/7
1/7           1/30     7/30     4/15     -1/5     -4/5     -4/5


If $F(z)$ denotes the difference of the left and right sides of (5.1), and if $\sum_{\zeta}$ denotes the
sum over a complete set of inequivalent cusps for $\Gamma_0(35)$, then, by the valence formula
(Part IV [2, p. 239]),

\[ 0 = \sum_{\zeta} \operatorname{ord}(F; \zeta) \ge \operatorname{ord}(F; \infty) - \frac{4}{35} - \frac{4}{7} - \frac{4}{5} = \operatorname{ord}(F; \infty) - \frac{52}{35}. \tag{5.3} \]

Thus, if we can show that $F(z) = O(q^2)$ as $q \to 0$ ($z \to i\infty$), then we will have obtained
a contradiction to (5.3), unless $F(z) \equiv 0$, which is what we want to prove. In fact, using
Mathematica, we find that


I 3 5 7 
LU gee eg l6g* +--+: = R(1.4). 
q q- q 
This then completes the proof of (5.1). □
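
The cusp orders used in the proof can be tabulated mechanically. The sketch below rests on an assumption about how (5.2) is to be read, namely that the order of $\eta(nz)$ at a cusp $r/s$ with $(r, s) = 1$ is $(n, s)^2/(24n)$; it then computes the orders of $u = \eta(z)/\eta(5z)$ and $v = \eta(7z)/\eta(35z)$ at the cusps $0$, $1/5$, $1/7$ with exact rational arithmetic. The names `eta_order` and `quotient_order` are ours.

```python
from fractions import Fraction
from math import gcd

def eta_order(n, s):
    """Assumed reading of (5.2): order of eta(n z) at a cusp r/s is (n, s)^2 / (24 n)."""
    return Fraction(gcd(n, s) ** 2, 24 * n)

def quotient_order(exponents, s):
    """Order at r/s of the product of eta(n z)^e over exponents = {n: e}."""
    return sum(e * eta_order(n, s) for n, e in exponents.items())

U = {1: 1, 5: -1}      # u = eta(z) / eta(5z)
V = {7: 1, 35: -1}     # v = eta(7z) / eta(35z)

for s in (1, 5, 7):    # denominators of the cusps 0 = 0/1, 1/5, 1/7
    ou, ov = quotient_order(U, s), quotient_order(V, s)
    print(s, ou, ov, ou + ov, ou - ov)
```

The four printed orders per cusp are those of $u$, $v$, $uv$, and $u/v$.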


In the second set of eta-function identities, Ramanujan sets

\[ u^2 := \frac{f(-q^{1/5})}{q^{1/5} f(-q^5)} \quad\text{and}\quad v^2 := \frac{f(-q^{n/5})}{q^{n/5} f(-q^{5n})}. \tag{6.1} \]

It is not clear why Ramanujan did not write $u$ and $v$ for $u^2$ and $v^2$, respectively. There are
just two modular equations in the second set.
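Entry 6 For $n = 2$ in (6.1), the functions $u$ and $v$ satisfy the modular equation of degree 25,

\[ uv + \frac{5}{uv} = \left( \frac{u}{v} \right)^3 + \left( \frac{v}{u} \right)^3 - 2\left( \frac{u}{v} + \frac{v}{u} \right). \tag{6.2} \]

Proof: We rewrite Entry 58 of Chapter 25 in Part IV [2, pp. 212, 213] in Ramanujan’s
notation (6.1). Thus, since $P = u^2$ and $Q = v^2$, we find that

\[ (uv)^2 + \frac{25}{(uv)^2} = \left( \frac{u}{v} \right)^6 + \left( \frac{v}{u} \right)^6 - 4\left\{ \left( \frac{u}{v} \right)^4 + \left( \frac{v}{u} \right)^4 \right\}. \tag{6.3} \]

In (6.2), observe that, for sufficiently small and positive $q$, each side is positive. It thus suffices
to show that the squares of both sides of (6.2) are equal, i.e., after slight simplification,
we want to prove that

\[ (uv)^2 + \frac{25}{(uv)^2} = \left( \frac{u}{v} \right)^6 + \left( \frac{v}{u} \right)^6 + 4\left\{ \left( \frac{u}{v} \right)^2 + \left( \frac{v}{u} \right)^2 \right\} - 4\left( \frac{u}{v} + \frac{v}{u} \right) \left\{ \left( \frac{u}{v} \right)^3 + \left( \frac{v}{u} \right)^3 \right\}. \tag{6.4} \]

In comparing (6.3) with (6.4), we see that it remains to prove that

\[ -4\left\{ \left( \frac{u}{v} \right)^4 + \left( \frac{v}{u} \right)^4 \right\} = 4\left\{ \left( \frac{u}{v} \right)^2 + \left( \frac{v}{u} \right)^2 \right\} - 4\left( \frac{u}{v} + \frac{v}{u} \right) \left\{ \left( \frac{u}{v} \right)^3 + \left( \frac{v}{u} \right)^3 \right\}. \]

Since the last equality is trivial, the proof is complete. □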


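Entry 7 If $n = 3$ in (6.1), the functions $u$ and $v$ satisfy the modular equation of degree 75,

\[ (uv)^2 + \frac{25}{(uv)^2} + 3\left( \frac{u}{v} + \frac{v}{u} \right) \left( uv + \frac{5}{uv} \right) = \left( \frac{u}{v} \right)^4 + \left( \frac{v}{u} \right)^4 - 6\left\{ \left( \frac{u}{v} \right)^2 + \left( \frac{v}{u} \right)^2 \right\} - 9. \tag{7.1} \]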
Proof: From Schoeneberg’s book [10, p. 102], if $\sigma_\infty$ denotes the number of inequivalent
cusps of $\Gamma_0(N)$, then

\[ \sigma_\infty = \sum_{d \mid N} \varphi((d, N/d)), \]

where $\varphi$ denotes Euler’s $\varphi$-function, and $(a, b)$ denotes the greatest common divisor of $a$
and $b$. If $N = 75$, then $\sigma_\infty = 12$, and a complete set of inequivalent cusps is given by


Pole gk a A eh ek 
{0. 00, 3,5, 19> 15> 30° 35° 30° 45° 50° GOI 
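
The cusp count quoted from Schoeneberg’s book is simple to evaluate. The following sketch computes $\sum_{d \mid N} \varphi((d, N/d))$ directly; the function names are ours, and Euler’s function is computed by brute force, which is adequate for small $N$.

```python
from math import gcd

def euler_phi(n):
    """Euler's phi-function by direct counting (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def cusp_count(N):
    """Number of inequivalent cusps of Gamma_0(N): sum over d | N of phi((d, N/d))."""
    return sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)

print(cusp_count(35), cusp_count(75))
```

For $N = 35$ this gives the four cusps used in the proof of Entry 5, and for $N = 75$ it gives the twelve cusps needed here.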


Set $U(q) = u(q^5)$ and $V(q) = v(q^5)$. In the notation of Part IV [2, pp. 237, 238],

\[ U^2 V^2 = R_{25,3}(z) \quad\text{and}\quad V^2/U^2 = S_{25,3}(z), \]

where $q = \exp(2\pi i z)$. By Lemmas 68.1 and 68.2 in Part IV [2, pp. 237, 238],

\[ R_{25,3}(z), \ S_{25,3}(z) \in \{\Gamma_0(75), 0, 1\}. \]




Using (5.2), we compose the following table for orders of cusps. 


cusp/order uu? v? uv u/v L(7.1) R(7.1) 
0 oe 
25 75 75 75 75 75 
i de oe ee, ee ee 
3 25 25 25 25 25 25 
S 0 0 0 O 0 0 
b 0 0 O 0 0 0 
* 0 0 O 0 0 0 
1 
+5 0 0 oO 0 0 0 
af a a ae ee ae 
i 3 3 3 3 3 
30 0 0 oO 0 0 0 
+ 0 0 oO 0 0 0 
al a. esl, ee: ee a 
3 3 3 3 3 
35 0 0 oO 0 0 0 


If F (z) denotes the difference of the left and right sides of (7.1), andif } |, denotes the sum 
over acomplete set of inequivalent cusps, then, by the valence formula and the tables above, 


O=) ord(F; £) > ord(F; 00) — %& — #9 — F — F = —F8. (7.2) 
¢ 
Thus, if we can show that F(z) = O(q?) as q tends to 0, or z tends to ioo, then we will 


have shown a contradiction to (7.2) unless F(z) = 0, which is what we want to prove. In 
fact, using Mathematica, we find that 


l 2 1 2 
q q q q 
This then completes the proof. O 


The third set of eta-function identities comprises five modular equations. For these,
Ramanujan sets

\[ u := \frac{f(-q)}{q^{(n-1)/24} f(-q^n)} \quad\text{and}\quad v := \frac{f(-q^5)}{q^{5(n-1)/24} f(-q^{5n})}. \tag{8.1} \]

Entry 8 For $n = 2$ in (8.1), the functions $u$ and $v$ satisfy the modular equation of degree 5,

\[ (uv)^2 + \left( \frac{2}{uv} \right)^2 = \left( \frac{v}{u} \right)^3 - \left( \frac{u}{v} \right)^3. \tag{8.2} \]
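Proof: We will prove that (8.2) is equivalent to Entry 13(xiv) in Chapter 19 [1, p. 282]. To
that end, first set

\[ U := \frac{f(q)}{q^{1/24} f(-q^2)} \quad\text{and}\quad V := \frac{f(q^5)}{q^{5/24} f(-q^{10})}. \]

If we replace $q$ by $-q$ in (8.2), we then find that (8.2) is equivalent to the identity

\[ (UV)^2 - \left( \frac{2}{UV} \right)^2 = \left( \frac{V}{U} \right)^3 + \left( \frac{U}{V} \right)^3. \tag{8.3} \]

We now apply Entries 12(i) and 12(iii) in Chapter 17 [1, p. 124] to deduce that

\[ UV = 2^{1/3} \{\alpha\beta(1 - \alpha)(1 - \beta)\}^{-1/24} \quad\text{and}\quad \frac{U}{V} = \left( \frac{\beta(1 - \beta)}{\alpha(1 - \alpha)} \right)^{1/24}. \]

Thus, (8.3) is equivalent to the modular equation of degree 5,

\[ \frac{2^{2/3}}{\{\alpha\beta(1 - \alpha)(1 - \beta)\}^{1/12}} - 2^{4/3} \{\alpha\beta(1 - \alpha)(1 - \beta)\}^{1/12} = \left( \frac{\alpha(1 - \alpha)}{\beta(1 - \beta)} \right)^{1/8} + \left( \frac{\beta(1 - \beta)}{\alpha(1 - \alpha)} \right)^{1/8}. \tag{8.4} \]

But with

\[ P := \{16\alpha\beta(1 - \alpha)(1 - \beta)\}^{1/12} \quad\text{and}\quad Q := \left( \frac{\beta(1 - \beta)}{\alpha(1 - \alpha)} \right)^{1/8}, \]

(8.4) may be rewritten in the form

\[ Q + \frac{1}{Q} + 2\left( P - \frac{1}{P} \right) = 0, \]

which is Entry 13(xiv) in Chapter 19 [1, p. 282]. □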


Entry 9 With $n = 3$ in (8.1), the functions $u$ and $v$ satisfy the modular equation of degree 15,

\[ (uv)^2 + \left( \frac{3}{uv} \right)^2 + 5 = \left( \frac{v}{u} \right)^3 - \left( \frac{u}{v} \right)^3. \]

Entry 9 is identical to Entry 62 in Chapter 25 [2, p. 221].


Entry 10 With $n = 4$ in (8.1), the functions $u$ and $v$ satisfy the modular equation of degree 5,
w+ (SP aC +(P-aG+s) aan 


Proof: By Entries 12(ii) and 12(iv) in Chapter 17 [1, p. 124], 


and 


a gS -_ 1/8 
y= ae = vi(- *) | 
where $\beta$ has degree 5 over $\alpha$. It follows that
i = 1/8 is 1/8 
ww =2 (T=) ig Coes eee) (10.2) 
ap v a(1 — B) 
Thus, using (10.2), we see that, to prove (10.1), it suffices to prove the fifth degree modular 


equation 
os = 1/4 1/4 = 3/8 
a(S a) (I e) +4( arp ) =(= >) 
ap (1 —a@)(1 — B) BCL — @) 


3/8 1/8 _ 1/8 
Ce ») - pe p) = a) | id) 
a(l — B) BCI — a) a(l — B) 
Recall that p is defined by (3.3). From Part III [1, pp. 285, 286, eqs. (13.10), (13.11)], 
we find that 


(S — a)(1 — ay" = (2 — 3m +5)(p — m? + 3m)\'" 
ap ~~ \(p + 3m —5)(p + m2 — 3m) 
where m is the multiplier of degree 5. Also, from Part III [1, p. 286, eq. (13.12)], 
& =f) 1/8 (2m +p 1/2 
B(1 — @) — \Qnm-p}y - 


Thus, (10.3) may be recast in the form 


(p + 3m — 5)(p + m* — 3m) (p — 3m + 5)(p — m2 + 3m) 


2m + p ai 2m — p ote 2m+ p be 2m — p ae 
em + —5 — 5 r) 
2m — 0 2m + p 2m — pO 2m + p 


A{(p —3m+5)(p — m? + 3m) + (p + 3m — 5)(p + m* — 3m)} 
(pom) pone Sm) ie 
_ Gm + p)? + (2m — py 20m 
(4m2 — p2)3/2 ~ (4m2 — p2)1/2° 


Or 


(10.4) 


Expanding all the numerators above, employing (3.3), putting the right side under one 
denominator, and omitting a goodly amount of elementary algebra, we find that (10.4) 
reduces to the equation 


l 1 
(m3 — 11m? + 35m — 25)!/2(—m3 + Tm2 — im +5)'/2 (—m? + 6m — 5)3/2 
(10.5) 


However, (10.5) is easily shown by factoring all the polynomials in it, and so this completes 
the proof. CO 
Entry 11 With $n = 5$, the functions $u$ and $v$ in (8.1) satisfy the modular equation of degree 25,

\[ (uv)^2 + \left( \frac{5}{uv} \right)^2 + 5\left( uv + \frac{5}{uv} \right) + 15 = \left( \frac{v}{u} \right)^3. \]

Entry 11 is identical to Entry 4 above.


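Entry 12 With $n = 7$, the functions $u$ and $v$ in (8.1) satisfy the modular equation of degree 35,

\[ (uv)^2 + \left( \frac{7}{uv} \right)^2 - 5 = \left( \frac{v}{u} \right)^3 - \left( \frac{u}{v} \right)^3 - 5\left\{ \left( \frac{v}{u} \right)^2 + \left( \frac{u}{v} \right)^2 \right\}. \]

Entry 12 is the same as Entry 71 in Chapter 25 [2, p. 236].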
The fourth and last set of eta-function identities contains two modular equations featuring 


f(-q") f(—q) 
al gO—)/24 F(—@>) and i gq (Sn—1)/24 Ff (—g>”) ° (13.1) 


Entry 13 With n = 2 in (13.1), the two functions u and v satisfy the modular equation of 


degree 5, 
5 ¢ 2 (=) 
uv-—-=(-) -(=). 
Uv u v 
Entry 13 is identical to Entry 54 of Chapter 25 [2, p. 207]. 


Entry 14 With n = 3 in (13.1), the functions u and v satisfy the modular equation of 


degree 15, 
5 \° v\4 3u\4 v\2 a \" 
Seca ef ec fe ae ae iesaced 
y) (=) &, (=) +(<) ( = | 
Entry 14 is identical to Entry 64 in Chapter 25 [2, p. 226].


For another approach, using the Atkin-Lehner involution, to deriving modular equations 
of the type considered in this section, see a paper by H.H. Chan and M.L. Lang [4]. 


2 Summary of Modular Equations of Six Kinds 


We reproduce Ramanujan’s summary of several of his modular equations found in a fragment
on pages 350-352 of [9]. The modular equations are grouped into six types. It is
interesting that, in contrast to his work in the notebooks [7] and lost notebook [9], Ramanujan
used the more standard notations of $k$ and $\ell$ to denote the moduli. Since most of the modular
equations in this section have been given elsewhere by Ramanujan, so that readers may
more easily compare the results stated here with Ramanujan’s other work, we have put all
the modular equations in Ramanujan’s original notation.
There are three modular equations in the first set. 
Entry 15 If $\beta$ has degree 2 over $\alpha$, then

\[ (1 - \sqrt{1 - \alpha})(1 - \sqrt{\beta}) = 2\sqrt{\beta(1 - \alpha)}. \]

Entry 16 If $\beta$ has degree 4 over $\alpha$, then

\[ (1 - \sqrt[4]{1 - \alpha})(1 - \sqrt[4]{\beta}) = 2\sqrt[4]{\beta(1 - \alpha)}. \]

Entry 17 If $\beta$ has degree 8 over $\alpha$, then

\[ (1 - \sqrt[4]{1 - \alpha})(1 - \sqrt[4]{\beta}) = 2\sqrt{2} \, \sqrt[8]{\beta(1 - \alpha)}. \]


After some elementary algebraic manipulation, it is easily seen that Entry 15 is equivalent
to part of equation (24.12) in Chapter 18 of Part III [1, p. 213] and that Entry 16 is equivalent
to (24.22) in Chapter 18 [1, p. 215]. Entry 17 is the equation just before Entry 24(vi) in
Part III [1, p. 217]. Unfortunately, we erroneously claimed [1, pp. 216, 217] that two of
Ramanujan’s modular equations with degrees 8 and 16 are incorrect. It was the author,
not Ramanujan, who was incorrect, and our work was corrected in Part V [3]. Modular
equations of degree $2^n$ can be obtained from classical theta-function identities by iterating
modular equations of degree $2^{n-1}$. However, the complexity of these modular equations
increases rapidly with $n$.
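
Entries 15–17 can be tested numerically from theta-functions. The sketch below assumes the standard representations $\alpha = 16q\,\psi^4(q^2)/\varphi^4(q)$ and $1 - \alpha = \varphi^4(-q)/\varphi^4(q)$, with $\psi(q) = \sum_{j \ge 0} q^{j(j+1)/2}$, and the fact that $\beta$ of degree $n$ over $\alpha$ corresponds to replacing $q$ by $q^n$; it also assumes the three entries read as $(1-\sqrt{1-\alpha})(1-\sqrt{\beta}) = 2\sqrt{\beta(1-\alpha)}$, $(1-\sqrt[4]{1-\alpha})(1-\sqrt[4]{\beta}) = 2\sqrt[4]{\beta(1-\alpha)}$, and $(1-\sqrt[4]{1-\alpha})(1-\sqrt[4]{\beta}) = 2\sqrt{2}\,\sqrt[8]{\beta(1-\alpha)}$. All function names are ours.

```python
import math

def phi(q, terms=40):
    """phi(q) = sum over all integers j of q^(j^2)."""
    return 1.0 + 2.0 * sum(q ** (j * j) for j in range(1, terms))

def psi(q, terms=40):
    """psi(q) = sum over j >= 0 of q^(j(j+1)/2)."""
    return sum(q ** (j * (j + 1) // 2) for j in range(terms))

def alpha(q):
    """alpha = k^2 for the nome q, via 16 q psi(q^2)^4 / phi(q)^4."""
    return 16.0 * q * (psi(q * q) / phi(q)) ** 4

def one_minus_alpha(q):
    """1 - alpha = phi(-q)^4 / phi(q)^4."""
    return (phi(-q) / phi(q)) ** 4

def residuals_15_16_17(q):
    """Left side minus right side of Entries 15, 16, 17 at the nome q."""
    a1 = one_minus_alpha(q)
    b2, b4, b8 = alpha(q ** 2), alpha(q ** 4), alpha(q ** 8)
    e15 = (1 - a1 ** 0.5) * (1 - b2 ** 0.5) - 2 * (b2 * a1) ** 0.5
    e16 = (1 - a1 ** 0.25) * (1 - b4 ** 0.25) - 2 * (b4 * a1) ** 0.25
    e17 = (1 - a1 ** 0.25) * (1 - b8 ** 0.25) - 2 * math.sqrt(2) * (b8 * a1) ** 0.125
    return e15, e16, e17

print(residuals_15_16_17(0.1))
```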

There are three sets of modular equations in the second and third groups as well. 


Entry 18 If $m$ denotes the multiplier of degree 2 and $\beta$ has degree 2 over $\alpha$, then


» Il+J/JB — I+4+8 
— 147i—a t+(l-a) 


Entry 19 If $m$ denotes the multiplier of degree 4 and $\beta$ has degree 4 over $\alpha$, then


1+ YB  14+/8 


Entry 20 If $m$ denotes the multiplier of degree 16 and $\beta$ has degree 16 over $\alpha$, then


1+ VB _ 
Ey eee 


The two equalities of Entry 18 are given in (24.17), as part of Entry 24(ii) in Chapter 18
[1, p. 214]. The two equalities of Entry 19 are given in (24.20), as part of Entry 24(iii) in
Chapter 18 [1, p. 215]. Lastly, Entry 20 is given in the middle of page 216 of [1] and is
part of Entry 24(iv) of Chapter 18.


ii = 


Entry 21 If $m$ is the multiplier of degree 2 and $\beta$ has degree 2 over $\alpha$, then

(a) $m\sqrt{1 - \alpha} + \sqrt{\beta} = 1$,
(b) $\dfrac{2}{m}\sqrt{\beta} + \sqrt{1 - \alpha} = 1$,
(c) $m^2\sqrt{1 - \alpha} + \beta = 1$,

and

(d) $\dfrac{4}{m^2}\sqrt{\beta} + (1 - \alpha) = 1$.


Parts (a) and (c) are parts of Entry 24(ii) in Chapter 18 [1, p. 214, eqs. (24.15), (24.16)].
The equation in part (b) is the reciprocal of that of (a), and the equation in part (d) is the
reciprocal of that of (c). (For the definition of the reciprocal of a modular equation, see
Part III [1, p. 216, Entry 24(v)].)


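Entry 22 If $m$ is the multiplier of degree 4 and $\beta$ has degree 4 over $\alpha$, then

(a) $\sqrt{m}\,\sqrt[4]{1 - \alpha} + \sqrt[4]{\beta} = 1$,
(b) $\dfrac{2}{\sqrt{m}}\sqrt[4]{\beta} + \sqrt[4]{1 - \alpha} = 1$,
(c) $m\sqrt[4]{1 - \alpha} + \sqrt{\beta} = 1$,

and

(d) $\dfrac{4}{m}\sqrt[4]{\beta} + \sqrt{1 - \alpha} = 1$.


Parts (a) and (c) are parts of Entry 24(iii) in Chapter 18 [1, pp. 214, 215, eqs. (24.18),
(24.19)]. The modular equations in parts (b) and (d) are the reciprocals of those in parts (a)
and (c), respectively.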


Entry 23 If $m$ is the multiplier of degree 8 and $\beta$ has degree 8 over $\alpha$, then

(a) $\sqrt{m}\,\sqrt[8]{1 - \alpha} + \sqrt[4]{\beta} = 1$,

and

(b) $\dfrac{2\sqrt{2}}{\sqrt{m}}\sqrt[8]{\beta} + \sqrt[4]{1 - \alpha} = 1$.


Part (a) is the same as equation (24.24) on page 216 of Part III [1], while the equation of
part (b) is the reciprocal of part (a).
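
The multiplier relations of Entries 21–23 can be checked in the same way. The sketch below assumes, in addition to the theta-function formulas $\alpha = 16q\,\psi^4(q^2)/\varphi^4(q)$ and $1 - \alpha = \varphi^4(-q)/\varphi^4(q)$, that by (0.1) the multiplier of degree $n$ is $m = \varphi^2(q)/\varphi^2(q^n)$, and that the entries read $m\sqrt{1-\alpha} + \sqrt{\beta} = 1$ (21(a)), $\sqrt{m}\sqrt[4]{1-\alpha} + \sqrt[4]{\beta} = 1$ (22(a)), $\sqrt{m}\sqrt[8]{1-\alpha} + \sqrt[4]{\beta} = 1$ (23(a)), and $(2\sqrt2/\sqrt{m})\sqrt[8]{\beta} + \sqrt[4]{1-\alpha} = 1$ (23(b)). The function names are ours.

```python
import math

def phi(q, terms=40):
    """phi(q) = sum over all integers j of q^(j^2)."""
    return 1.0 + 2.0 * sum(q ** (j * j) for j in range(1, terms))

def psi(q, terms=40):
    """psi(q) = sum over j >= 0 of q^(j(j+1)/2)."""
    return sum(q ** (j * (j + 1) // 2) for j in range(terms))

def alpha(q):
    """alpha = 16 q psi(q^2)^4 / phi(q)^4."""
    return 16.0 * q * (psi(q * q) / phi(q)) ** 4

def one_minus_alpha(q):
    """1 - alpha = phi(-q)^4 / phi(q)^4."""
    return (phi(-q) / phi(q)) ** 4

def multiplier(q, n):
    """m = K/L = phi(q)^2 / phi(q^n)^2 for degree n."""
    return (phi(q) / phi(q ** n)) ** 2

def multiplier_residuals(q):
    """Residuals of Entries 21(a), 22(a), 23(a), 23(b) at the nome q."""
    a1 = one_minus_alpha(q)
    m2, m4, m8 = multiplier(q, 2), multiplier(q, 4), multiplier(q, 8)
    b2, b4, b8 = alpha(q ** 2), alpha(q ** 4), alpha(q ** 8)
    e21a = m2 * a1 ** 0.5 + b2 ** 0.5 - 1
    e22a = math.sqrt(m4) * a1 ** 0.25 + b4 ** 0.25 - 1
    e23a = math.sqrt(m8) * a1 ** 0.125 + b8 ** 0.25 - 1
    e23b = math.sqrt(8 / m8) * b8 ** 0.125 + a1 ** 0.25 - 1
    return e21a, e22a, e23a, e23b

print(multiplier_residuals(0.1))
```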

Ramanujan records four modular equations in his fourth set. The first, due to Legendre,
is historically the first modular equation of a degree which is not a power of two. As we
emphasized in Part III [1, Chaps. 19, 20], H. Schröter derived several modular equations
of this sort. Many modular equations of this kind also were derived by R. Russell, but his
methods are not completely rigorous. Russell’s method has been put on a firm foundation
by H.H. Chan and W.-C. Liaw [5].


Entry 24 If $\beta$ has degree 3 over $\alpha$, then

\[ \{\alpha\beta\}^{1/4} + \{(1 - \alpha)(1 - \beta)\}^{1/4} = 1. \]

Entry 24 is also Entry 5(ii) of Chapter 19 [1, p. 230].

Entry 25 If $\beta$ has degree 7 over $\alpha$, then

\[ \{\alpha\beta\}^{1/8} + \{(1 - \alpha)(1 - \beta)\}^{1/8} = 1. \]

Entry 25 is identical to part of Entry 19(i) of Chapter 19 [1, p. 314].
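
Legendre’s equation of degree 3 and its degree 7 analogue lend themselves to a direct numerical check. The sketch below assumes the theta-function formulas $\alpha = 16q\,\psi^4(q^2)/\varphi^4(q)$ and $1 - \alpha = \varphi^4(-q)/\varphi^4(q)$, with $\beta$ obtained by replacing $q$ by $q^n$, and tests $\{\alpha\beta\}^{1/4} + \{(1-\alpha)(1-\beta)\}^{1/4} = 1$ for $n = 3$ and $\{\alpha\beta\}^{1/8} + \{(1-\alpha)(1-\beta)\}^{1/8} = 1$ for $n = 7$. The function names are ours.

```python
def phi(q, terms=40):
    """phi(q) = sum over all integers j of q^(j^2)."""
    return 1.0 + 2.0 * sum(q ** (j * j) for j in range(1, terms))

def psi(q, terms=40):
    """psi(q) = sum over j >= 0 of q^(j(j+1)/2)."""
    return sum(q ** (j * (j + 1) // 2) for j in range(terms))

def alpha(q):
    """alpha = 16 q psi(q^2)^4 / phi(q)^4."""
    return 16.0 * q * (psi(q * q) / phi(q)) ** 4

def one_minus_alpha(q):
    """1 - alpha = phi(-q)^4 / phi(q)^4."""
    return (phi(-q) / phi(q)) ** 4

def legendre_type_residual(q, degree, root):
    """{alpha beta}^(1/root) + {(1-alpha)(1-beta)}^(1/root) - 1, beta of the given degree."""
    a, b = alpha(q), alpha(q ** degree)
    a1, b1 = one_minus_alpha(q), one_minus_alpha(q ** degree)
    return (a * b) ** (1.0 / root) + (a1 * b1) ** (1.0 / root) - 1.0

print(legendre_type_residual(0.1, 3, 4))   # Entry 24
print(legendre_type_residual(0.1, 7, 8))   # Entry 25
```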
Entry 26 If $\beta$ has degree 15 over $\alpha$, then


(apy'"6 (1 + Yay + VBS + ( — vaya — BDI") 
+ {1 —ay(1 — py} "6 (+ VT ay + JT By} 
+ (—Vi—ayc — 1 By") = V2. 
Entry 26 is identical to Entry 20(vi) in Chapter 20 [1, p. 384]. 


Entry 27 If $\beta$ has degree 31 over $\alpha$, then


ill (\ + Joy + VB) 1 + {ap}!/4 + (C1 — Va) — /B)} 4 
+d = Ya) VEY + (apy + (1+ Yaycl + VDI") 


+ {Ud —a)(1 — p)}'/? (ta +J1—a)1+/1—p)}8 


x Yl +{(— al — 6)" + ( - V1 el — Y1 — £4 


Hid=/l-ed=./1 =p) 


xYlt{d—a@)( —p)}'44 (14+ V1 a(t Vi= A's) a2 


Entry 27 is the same as Entry 22(i) in Chapter 20 [1, p. 439]. 
The fifth set in this fragment contains seven results. These results are similar to modular
equations of Russell-type. However, Ramanujan focuses on the algebraic expression,

\[ \left\{ \frac{1 + \sqrt{\alpha\beta} + \sqrt{(1 - \alpha)(1 - \beta)}}{2} \right\}^{1/2}, \]

which has not been prominent in the work of any other mathematician on modular equations.
In the next section, we will see how useful this expression becomes in simplifying
modular equations.


Entry 28 If $\beta$ has degree 7 over $\alpha$, then

\[ \left\{ \frac{1 + \sqrt{\alpha\beta} + \sqrt{(1 - \alpha)(1 - \beta)}}{2} \right\}^{1/2} = \{\alpha\beta\}^{1/8} + \{(1 - \alpha)(1 - \beta)\}^{1/8} - \{\alpha\beta(1 - \alpha)(1 - \beta)\}^{1/8}. \]

By combining both parts of Entry 19(i) in Chapter 19 [1, p. 314], we easily deduce
Entry 28.

Ramanujan claimed that the modular equation of Entry 28 also holds for $n = 5/3$. At first,
this is somewhat puzzling, since modular equations have been defined for only integral $n$.
However, Ramanujan evidently had in mind modular equations of degree 15, where, in
general, there are four moduli of degrees 1, 3, 5, and 15. Thus, Ramanujan asserted that
the moduli of degrees 3 and 5 satisfy the modular equation above. In the proof below, we
depend heavily on the parametrizations used in Part III [1, pp. 385-387] to establish many
modular equations of degree 15.


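Entry 29 If $\alpha$ and $\beta$ have degrees 3 and 5, respectively, then

\[ \left\{ \frac{1 + \sqrt{\alpha\beta} + \sqrt{(1 - \alpha)(1 - \beta)}}{2} \right\}^{1/2} = \{\alpha\beta\}^{1/8} + \{(1 - \alpha)(1 - \beta)\}^{1/8} - \{\alpha\beta(1 - \alpha)(1 - \beta)\}^{1/8}. \tag{29.1} \]


Proof: Using the notation in Part III [1, p. 385], but with $\beta$ and $\gamma$ there replaced by $\alpha$ and
$\beta$ here, we set

\[ B = \{\alpha\beta\}^{1/8} \quad\text{and}\quad B' = \{(1 - \alpha)(1 - \beta)\}^{1/8}. \]

Thus, (29.1) takes the form

\[ \left\{ \frac{1 + B^4 + B'^4}{2} \right\}^{1/2} = B + B' - BB'. \tag{29.2} \]

Also set [1, p. 386]

\[ B = \tfrac12 (M - \rho) \quad\text{and}\quad B' = \tfrac12 (M + \rho), \tag{29.3} \]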
where 
zs Z1215 
2325 


and where [1, eq. (11.4), p. 386], 


p= (29.4) 


Also [1, middle of p. 387],

\[ \left\{ \frac{1 + B^4 + B'^4}{2} \right\}^{1/2} = \frac{1 + M + 3M^2 - M^3}{4M}. \tag{29.5} \]

On the other hand, by (29.3) and (29.4),

\[ B + B' - BB' = M - \tfrac14 (M^2 - \rho^2) = M - \tfrac14 M^2 + \frac{1 + M - M^2}{4M} = \frac{1 + M + 3M^2 - M^3}{4M}. \tag{29.6} \]

Comparing (29.5) and (29.6), we complete the proof of (29.2) and so also of Entry 29. □
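
The algebra behind (29.5) and (29.6) can be checked numerically. The sketch below works under an assumption inferred from (29.6), namely that (29.4) amounts to $\rho^2 = (1 + M - M^2)/M$; with $B = \tfrac12(M - \rho)$ and $B' = \tfrac12(M + \rho)$ it compares both sides of (29.2) with the closed form $(1 + M + 3M^2 - M^3)/(4M)$ for sample values of $M$ in the range where $\rho$ is real and $B \ge 0$. The function names are ours.

```python
import math

def rho(M):
    """Assumed form of (29.4): rho^2 = (1 + M - M^2) / M."""
    return math.sqrt((1 + M - M * M) / M)

def entry29_sides(M):
    """Return (left side of (29.2), right side of (29.2), closed form in M)."""
    B, Bp = (M - rho(M)) / 2, (M + rho(M)) / 2
    left = math.sqrt((1 + B ** 4 + Bp ** 4) / 2)
    right = B + Bp - B * Bp
    closed = (1 + M + 3 * M ** 2 - M ** 3) / (4 * M)
    return left, right, closed

for M in (1.0, 1.2, 1.5, 1.6):
    print(M, entry29_sides(M))
```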
Entry 30 If $\beta$ has degree 15 over $\alpha$, then

\[ \left\{ \frac{1 + \sqrt{\alpha\beta} + \sqrt{(1 - \alpha)(1 - \beta)}}{2} \right\}^{1/2} = \{\alpha\beta\}^{1/8} + \{(1 - \alpha)(1 - \beta)\}^{1/8} + \{\alpha\beta(1 - \alpha)(1 - \beta)\}^{1/8}. \tag{30.1} \]


Proof: The proof is similar to that above, and again we rely heavily on the notation and
calculations from Part III. Set, as in Part III [1, p. 385], but with $\delta$ there replaced by $\beta$ here,

\[ A = \{\alpha\beta\}^{1/8} \quad\text{and}\quad A' = \{(1 - \alpha)(1 - \beta)\}^{1/8}. \]

Thus, (30.1) takes the form

\[ \left\{ \frac{1 + A^4 + A'^4}{2} \right\}^{1/2} = A + A' + AA'. \tag{30.2} \]

With (Part III [1, p. 386])

\[ A = \tfrac12 (M^{-1} - \rho) \quad\text{and}\quad A' = \tfrac12 (M^{-1} + \rho), \tag{30.3} \]

we find that (Part III [1, near bottom of p. 386])

\[ \left\{ \frac{1 + A^4 + A'^4}{2} \right\}^{1/2} = \frac{1 + 3M - M^2 + M^3}{4M^2}. \tag{30.4} \]

On the other hand, by using (30.3), (29.4), and a calculation similar to that in (29.6), we
find that

\[ A + A' + AA' = \frac{1 + 3M - M^2 + M^3}{4M^2}. \tag{30.5} \]

Comparing (30.4) and (30.5), we see that we have established (30.2), and the proof is
complete. □
Entry 31 If $\beta$ has degree 23 over $\alpha$, then

1+ /ap + /( — a@)(1 — B) 
jee = 45(1 + {ap}'/" + (C1 —a@)(1 — B)}') 
+ 2! api — a) — py}. (31.1) 


Entry 31 is identical to Entry 15(ii) in Chapter 20 [1, p. 411]. In fact, Ramanujan’s 
formulation in the lost notebook is erroneous, since the last term on the right side of (31.1) 
was replaced by 


27/7 faB(1 — a)(1 — p)}!/°. 
Entry 32 If $\beta$ has degree 31 over $\alpha$, then


JU —a)(1 — B) 
a = 1+ {ap}'4 + {4 —a)(l — p)}!4 


(ap) P= tea iap)ye 
— {aB(L — a)(1 — py}. 


Entry 32 is the same as Entry 22(iii) in Chapter 20 [1, p. 439].


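Entry 33 If β has degree 47 over α, then

\[ \sqrt{\frac{1+\sqrt{\alpha\beta}+\sqrt{(1-\alpha)(1-\beta)}}{2}} = \tfrac12\left(1+\{\alpha\beta\}^{1/4}+\{(1-\alpha)(1-\beta)\}^{1/4}\right) + \left\{\tfrac{1}{256}\alpha\beta(1-\alpha)(1-\beta)\right\}^{1/24}\left(1+\{\alpha\beta\}^{1/8}+\{(1-\alpha)(1-\beta)\}^{1/8}\right). \]


Entry 33 is the same as Entry 23(i) in Chapter 20 [1, p. 444].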


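Entry 34 If β has degree 71 over α, then

\[ \sqrt{\frac{1+\sqrt{\alpha\beta}+\sqrt{(1-\alpha)(1-\beta)}}{2}} = 1+\{\alpha\beta\}^{1/4}+\{(1-\alpha)(1-\beta)\}^{1/4} - \{\alpha\beta\}^{1/8} - \{(1-\alpha)(1-\beta)\}^{1/8} + \{\alpha\beta(1-\alpha)(1-\beta)\}^{1/8} + 2^{2/3}\{\alpha\beta(1-\alpha)(1-\beta)\}^{1/24}\left(\{\alpha\beta\}^{1/8}+\{(1-\alpha)(1-\beta)\}^{1/8}-1\right). \]


Entry 34 is identical to Entry 23(ii) of Chapter 20 [1, p. 444].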

The sixth and last group of modular equations in this fragment contains seven pairs of 
formulas for moduli. Formulas of this sort seem to have originated with Ramanujan, and 
Ramanujan’s methods for deriving these equations are unknown. From the definition of 
a modular equation, a formula for a modulus yields a transformation between two hyper- 
geometric functions. It appears likely that such formulas are, in fact, special cases of 
more general transformation formulas for hypergeometric functions involving one or more 
parameters. It would be worthwhile to investigate such possibilities. 


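Entry 35 If β and the multiplier m have degree 3, then

\[ m^2 = \left(\frac{\beta}{\alpha}\right)^{1/2} + \left(\frac{1-\beta}{1-\alpha}\right)^{1/2} - \left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/2}, \]
\[ \frac{9}{m^2} = \left(\frac{\alpha}{\beta}\right)^{1/2} + \left(\frac{1-\alpha}{1-\beta}\right)^{1/2} - \left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/2}. \]

Entry 36 If β and the multiplier m have degree 5, then

\[ m = \left(\frac{\beta}{\alpha}\right)^{1/4} + \left(\frac{1-\beta}{1-\alpha}\right)^{1/4} - \left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/4}, \]
\[ \frac{5}{m} = \left(\frac{\alpha}{\beta}\right)^{1/4} + \left(\frac{1-\alpha}{1-\beta}\right)^{1/4} - \left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/4}. \]

Entry 37 If β and the multiplier m have degree 7, then

\[ m^2 = \left(\frac{\beta}{\alpha}\right)^{1/2} + \left(\frac{1-\beta}{1-\alpha}\right)^{1/2} - \left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/2} - 8\left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/3}, \]
\[ \frac{49}{m^2} = \left(\frac{\alpha}{\beta}\right)^{1/2} + \left(\frac{1-\alpha}{1-\beta}\right)^{1/2} - \left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/2} - 8\left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/3}. \]


Entries 35, 36, and 37 are the same as Entries 5(vii), 13(xii), and 19(v), respectively, in
Chapter 19 [1, pp. 230, 281-282, 314].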


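Entry 38 If β and the multiplier m have degree 9, then

\[ \sqrt{m} = \left(\frac{\beta}{\alpha}\right)^{1/8} + \left(\frac{1-\beta}{1-\alpha}\right)^{1/8} - \left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/8}, \]
\[ \frac{3}{\sqrt{m}} = \left(\frac{\alpha}{\beta}\right)^{1/8} + \left(\frac{1-\alpha}{1-\beta}\right)^{1/8} - \left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/8}. \]

Entry 39 If β and the multiplier m have degree 13, then

\[ m = \left(\frac{\beta}{\alpha}\right)^{1/4} + \left(\frac{1-\beta}{1-\alpha}\right)^{1/4} - \left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/4} - 4\left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/6}, \]
\[ \frac{13}{m} = \left(\frac{\alpha}{\beta}\right)^{1/4} + \left(\frac{1-\alpha}{1-\beta}\right)^{1/4} - \left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/4} - 4\left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/6}. \]

Entry 40 If β and the multiplier m have degree 17, then

\[ m = \left(\frac{\beta}{\alpha}\right)^{1/4} + \left(\frac{1-\beta}{1-\alpha}\right)^{1/4} + \left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/4} - 2\left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/8}\left\{1+\left(\frac{\beta}{\alpha}\right)^{1/8}+\left(\frac{1-\beta}{1-\alpha}\right)^{1/8}\right\}, \]
\[ \frac{17}{m} = \left(\frac{\alpha}{\beta}\right)^{1/4} + \left(\frac{1-\alpha}{1-\beta}\right)^{1/4} + \left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/4} - 2\left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/8}\left\{1+\left(\frac{\alpha}{\beta}\right)^{1/8}+\left(\frac{1-\alpha}{1-\beta}\right)^{1/8}\right\}. \]


Entries 38, 39, and 40 are, respectively, Entries 3(x), (xi), Entries 8(iii), (iv), and Entries
12(iii), (iv) in Chapter 20 [1, pp. 352, 376, 397-398].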


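Entry 41 If β and the multiplier m have degree 25, then

\[ \sqrt{m} = \left(\frac{\beta}{\alpha}\right)^{1/8} + \left(\frac{1-\beta}{1-\alpha}\right)^{1/8} - \left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/8} - 2\left(\frac{\beta(1-\beta)}{\alpha(1-\alpha)}\right)^{1/12}, \]
\[ \frac{5}{\sqrt{m}} = \left(\frac{\alpha}{\beta}\right)^{1/8} + \left(\frac{1-\alpha}{1-\beta}\right)^{1/8} - \left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/8} - 2\left(\frac{\alpha(1-\alpha)}{\beta(1-\beta)}\right)^{1/12}. \]


Entry 41 is identical to Entries 15(i), (ii) in Chapter 19 [1, p. 291].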


3 A Fragment on Page 349 


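By introducing a new parameter, Q, Ramanujan found simpler forms for some old modular
equations and found some new ones as well. Each degree n satisfies the congruence n ≡ 7
(mod 16). Set

\[ P = 1-\{\alpha\beta\}^{1/8}-\{(1-\alpha)(1-\beta)\}^{1/8}, \]
\[ Q = \sqrt{\frac{1+\sqrt{\alpha\beta}+\sqrt{(1-\alpha)(1-\beta)}}{2}} - \{\alpha\beta\}^{1/8} - \{(1-\alpha)(1-\beta)\}^{1/8} + \{\alpha\beta(1-\alpha)(1-\beta)\}^{1/8}, \tag{42.1} \]
\[ R = 4\{\alpha\beta(1-\alpha)(1-\beta)\}^{1/8}. \]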


Entry 42 If β has degree 7 and P, Q, and R are defined by (42.1), then
\[ P = Q = 0. \]

Both equations above are in Entry 19(i) of Chapter 19 [1, p. 314].
Entry 43 If β has degree 23 and P, Q, and R are defined by (42.1), then
\[ P = R^{1/3} \quad\text{and}\quad P^2 = Q. \]
The equality P = R^{1/3} is Entry 15(i) in Chapter 20 [1, p. 411], while the equality
P² = Q, after some elementary manipulation, can be shown to be equivalent to Entry
15(ii) in Chapter 20 [1, p. 411].
Entry 44 If P, Q, and R are defined by (42.1) and n = 39, then
\[ Q(P^2-Q) = PR. \]

Entry 45 If P, Q, and R are defined by (42.1) and n = 55, then
\[ Q(P^2-Q)^2 = R(P^3-R). \]
Entry 46 If β has degree 71 and P, Q, and R are defined by (42.1), then
\[ P^2-Q = PR^{1/3}. \]
Entry 46 is identical to Entry 23(ii) in Chapter 20 [1, p. 444].
Entry 47 If P, Q, and R are defined by (42.1) and n = 119, then
\[ (P^2-Q)^2 = QR^{1/3}(P-R^{1/3}). \]


At this moment, we only have proofs of Entries 44, 45, and 47 using the theory of modular 
forms, which we do not give here. Hopefully, proofs in the spirit of Ramanujan can be 
given at a later date. 

Ramanujan also listed the numbers 103 and 167, but he did not give modular equations 
for these degrees. 
The abc-conjecture


Jerzy Browkin 


In the present paper we give a survey of the abc-conjecture and of its modifications and generalizations. We
discuss several consequences of the conjecture. At the end of the paper we give numerical examples
that provide some evidence for the conjecture.


1 Introduction 


1.1 The abc-conjecture 


In the present paper we discuss the abc-conjecture on an elementary level; we do not say
much about connections of the conjecture with more advanced theories.

We shall give the precise form of the abc-conjecture below (see 2.1); here we state it in
a preliminary, vague form, to explain the main idea behind it.

For any natural number n, let $r(n) = \prod_{p \mid n} p$ be the product of all distinct prime divisors
of n. Thus r(n) is the maximal squarefree divisor of n; we call it the radical of n.

We say that a positive integer n is very composite if r(n) is “small” with respect to n,
or equivalently, if $\frac{\log n}{\log r(n)}$ is “large”.

E.g. if

\[ n_1 = 2^{41}\cdot 3^{15}\cdot 7\cdot 11^{3}\cdot 97^{10} = 21679141557120823565001056764315386261798912, \]

then $r(n_1) = 2\cdot 3\cdot 7\cdot 11\cdot 97 = 44814$ and $\frac{\log n_1}{\log r(n_1)} = 9.316746$.
Similarly, if
\[ n_2 = 2^{3}\cdot 3\cdot 7^{2}\cdot 11\cdot 97 = 1254792, \]

then $r(n_2) = 2\cdot 3\cdot 7\cdot 11\cdot 97 = r(n_1)$ and $\frac{\log n_2}{\log r(n_2)} = 1.311122$.

Therefore $n_1$ is very composite, and $n_2$ is not.
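
The two examples above can be recomputed with a short script. This is only a sketch: the helper names `radical` and `compositeness` are ours, and the radical is found by plain trial division, which is adequate here because both numbers have only small prime factors.

```python
import math

def radical(n):
    """Product of the distinct primes dividing n (trial division)."""
    rad, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            rad *= p
            while n % p == 0:
                n //= p
        p += 1
    # whatever is left is 1 or a single prime factor
    return rad * n if n > 1 else rad

def compositeness(n):
    """The ratio log n / log r(n) used in the text."""
    return math.log(n) / math.log(radical(n))

n1 = 2**41 * 3**15 * 7 * 11**3 * 97**10
n2 = 2**3 * 3 * 7**2 * 11 * 97

print(radical(n1), compositeness(n1))
print(radical(n2), compositeness(n2))
```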


The abc-conjecture (a preliminary form). If a + b = c, where a, b are relatively prime
natural numbers, then the number abc cannot be very composite. In particular, the sum of
two relatively prime very composite numbers cannot be very composite.


Thus the abc-conjecture expresses some relation between the addition and the multipli- 
cation in the ring of integers. 


1991 Mathematics Subject Classification. 11D04, 11D41, 11C08. 
1.2 Consequences of the abc-conjecture


It turns out that many deep theorems concerning integers, and some known conjectures in
Number Theory, follow easily from the abc-conjecture. We shall discuss this in 2.3. Let us
mention here the following areas in which several interesting consequences
of the abc-conjecture are known:


a) Diophantine equations and inequalities, 
b) Elliptic curves, 
c) Polynomials. 


Therefore it seems difficult to prove the abc-conjecture; one may even doubt whether it is
true. One might instead replace the abc-conjecture by a weaker one, which still
implies many statements in Number Theory but could be easier to prove. We shall
discuss weak forms of the abc-conjecture in 2.4 and 2.5.


1.3 Modifications and Generalizations of the abc-conjecture 


Since it seems to be hopeless to attack the abc-conjecture directly, one can try to understand 
it better by considering some of its modifications and generalizations. 

E.g. we can replace the ring Z of rational integers by the ring of algebraic integers of 
an algebraic number field of a finite degree over the field Q of rational numbers, or by the 
ring of polynomials k[x] over a field k of characteristic zero, or, more generally, by a ring 
of algebraic functions of one variable. 

Another generalization is the following. In the abc-conjecture we consider sums of two
relatively prime integers; we can consider sums of n (where n ≥ 3 is fixed) relatively prime
integers, or of n pairwise relatively prime integers.

We shall discuss these generalizations in part 3. 


1.4 Limit Points 


One of the forms of the abc-conjecture states that the maximal limit point of the set $\mathcal{L}$ of
real numbers (defined below, see 2.1) is equal to 1. One may ask, more generally, about all
limit points of $\mathcal{L}$.

It has been proved recently (see [BFGC], [GN]) that every point in the interval [5, 32] 
is a limit point of the set $\mathcal{L}$, but we still cannot prove that 1 is a limit point of $\mathcal{L}$ (not
necessarily the maximal one!).

We shall discuss this topic in 4.1. 

Of course, one can consider simultaneously some different points of view mentioned 
above (e.g. to investigate limit points in the case of the ring of polynomials etc.). 


1.5 Computations 


We cannot prove that 1 is the maximal limit point of the set $\mathcal{L}$, but we can compute
many elements of $\mathcal{L}$ numerically (though not infinitely many!). Therefore we can look for large numbers
belonging to $\mathcal{L}$, to get some evidence for the abc-conjecture.
At the end of the paper we give the table of all known numbers > 1.4 belonging to $\mathcal{L}$,
and the corresponding values of a, b, c.


1.6 References 


There are several expository papers devoted to the abc-conjecture (see e.g. [La], [Ni]) 
containing long lists of references. In the references at the end of the paper we include a 
large number of important papers on the subject, but evidently our list is not complete. 

Let us mention that there exist (see e.g. [Vo]) more general conjectures, from which 
the abc-conjecture follows. One can interpret this in two ways: either it suggests that the 
abc-conjecture is true, or that these more general conjectures are false. 
2 The abc-conjecture and some of its Consequences


2.1 Statement of the Conjecture


For relatively prime natural numbers a, b denote

\[ L = L(a,b) = \frac{\log c}{\log r(abc)}, \]
where c = a + b, and r(n) is the radical of n defined above.
Let
\[ \mathcal{L} = \{L(a,b) : a,b\in\mathbb{N},\ \gcd(a,b)=1\}. \]

The abc-conjecture The maximal limit point of the set $\mathcal{L}$ is equal to 1:

\[ \limsup_{\gcd(a,b)=1} L(a,b) = 1. \]

Equivalently
\[ \forall\varepsilon>0\ \ \exists N\ \ \forall a,b>N,\ \gcd(a,b)=1:\qquad c < r(abc)^{1+\varepsilon}, \]
or
\[ \forall\varepsilon>0\ \ \exists C_\varepsilon\ \ \forall a,b\in\mathbb{N},\ \gcd(a,b)=1:\qquad c < C_\varepsilon\, r(abc)^{1+\varepsilon}. \]
2.2 The Present Status of the abc-conjecture (August 1999)


1. The largest known number belonging to $\mathcal{L}$ has been found by E. Reyssat [We]. It
corresponds to the equality $2 + 3^{10}\cdot 109 = 23^5$. Namely

\[ L(2,\ 3^{10}\cdot 109) = \frac{\log(23^5)}{\log(2\cdot 3\cdot 23\cdot 109)} = 1.629912. \]

Here $r(abc) = 2\cdot 3\cdot 23\cdot 109 = 15042$ is “small” with respect to $23^5 = 6436343$.

2. We have

\[ \limsup_{\gcd(a,b)=1} L(a,b) \ge 1. \]

In fact, for every n ≥ 1, we have $9^n = 1 + 8t_n$, for some $t_n \ge 1$. Consequently, if
$a = 1$, $b = 8t_n$, $c = 9^n$, then $r(abc) \le 6t_n < 9^n = c$, and hence

\[ L(a,b) = \frac{\log c}{\log r(abc)} > 1. \]

3. The table of all known pairs (a, b) such that L(a, b) > 1.4 is given at the end of the
paper. Some of them have been found recently (compare the earlier table in [BB]).

4. J. Kanapka [Ka] determined all pairs (a, b) such that gcd(a, b) = 1, L(a, b) > 1.2,
and $a + b = c < 2^{32}$.

5. C.L. Stewart and Yu Kunrui [SY1], [SY2] proved that, for some effective constant k
and every pair (a, b) of relatively prime natural numbers with c = a + b, we have

\[ \log c < r(abc)^{\frac{2}{3}+\frac{k}{\log\log r(abc)}}. \]

The inequality in the abc-conjecture is much stronger.

6. On the other hand C.L. Stewart and R. Tijdeman [ST] proved that, for every δ > 0,
we have

\[ \log c > \log r(abc) + (4-\delta)\,\frac{\sqrt{\log r(abc)}}{\log\log r(abc)} \]

for infinitely many pairs (a, b) of relatively prime natural numbers and c = a + b.
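
The first two items of the list above can be checked numerically. The following is a minimal sketch; the function names `radical` and `quality` are ours, and trial division is used only because the numbers involved are small.

```python
import math

def radical(n):
    """Product of the distinct primes dividing n (trial division)."""
    rad, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            rad *= p
            while n % p == 0:
                n //= p
        p += 1
    return rad * n if n > 1 else rad

def quality(a, b):
    """L(a, b) = log c / log r(abc) for coprime a, b and c = a + b."""
    if math.gcd(a, b) != 1:
        raise ValueError("a and b must be relatively prime")
    c = a + b
    return math.log(c) / math.log(radical(a * b * c))

# Reyssat's example: 2 + 3^10 * 109 = 23^5
reyssat = quality(2, 3**10 * 109)

# The family a = 1, b = 9^n - 1, c = 9^n always gives L(a, b) > 1
family = [quality(1, 9**n - 1) for n in range(1, 8)]

print(reyssat)
print(family)
```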


2.3 Some Consequences of the abc-conjecture 


We state here some theorems and conjectures which follow from the abc-conjecture. For
most proofs we give references to the literature. We shall include some proofs later (see 2.4
and 2.5), when we discuss the weak forms of the abc-conjecture.


2.3.1 Fermat’s Last Theorem 


From the abc-conjecture it follows that there is only a finite number of positive integers
n, x, y, z satisfying n > 3, gcd(x, y) = 1 and

\[ x^n + y^n = z^n. \]
Proof: Let $a = x^n$, $b = y^n$, $c = z^n$. Then

\[ L(a,b) = \frac{n\log z}{\log r(xyz)} > \frac{n\log z}{3\log z} = \frac{n}{3} \ge \frac{4}{3}. \]
The finiteness follows from lim sup L(a, b) = 1. □


2.3.2 The Equation n! + 1 = m²
It is easy to verify that 4! + 1 = 5² and 5! + 1 = 11².


Theorem 2.1 (M. Overholt). From the abc-conjecture it follows that the equation

\[ n! + 1 = m^2 \]
has only a finite number of solutions.
Proof: See [Ov] and 2.5.2. □
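
A brute-force search for small solutions is easy to sketch. The function name and the search bound below are our own choices; besides the two solutions mentioned above, the search also finds 7! + 1 = 71².

```python
import math

def brocard_solutions(limit):
    """All (n, m) with n! + 1 = m^2 and 1 <= n <= limit."""
    solutions, factorial = [], 1
    for n in range(1, limit + 1):
        factorial *= n
        m = math.isqrt(factorial + 1)
        if m * m == factorial + 1:
            solutions.append((n, m))
    return solutions

print(brocard_solutions(60))
```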
2.3.3 Hall’s Conjecture
M. Hall, Jr. [Ha] conjectured that
\[ |x^2-y^3| > C\sqrt{y} \]
for positive integers x, y with x² ≠ y³, where C is a positive constant.
On the other hand the result of L.V. Danilov [Da] improved by A. Schinzel [Si] says that
for infinitely many positive integers x, y satisfying x² ≠ y³ we have

\[ |x^2-y^3| < \frac{54}{25\sqrt{5}}\,\sqrt{y}. \]

The abc-conjecture implies the weak Hall conjecture: For every ε > 0 there exists
C(ε) > 0 such that

\[ |x^2-y^3| > C(\varepsilon)\, y^{1/2-\varepsilon} \]

for all positive integers x, y satisfying x² ≠ y³.
Proof: See [Ni] and 2.5.3. □
2.3.4 The Congruence $a^{p-1} \equiv 1 \pmod{p^2}$
Only two prime numbers p are known to satisfy
\[ 2^{p-1} \equiv 1 \pmod{p^2}, \]

namely p = 1093 and p = 3511.
Nevertheless we cannot prove that the number of such primes is finite. A weaker result
in this direction follows from the abc-conjecture:
Theorem 2.2 (J.H. Silverman). The abc-conjecture implies that for every a ≥ 2 there are
infinitely many primes p satisfying $a^{p-1} \not\equiv 1 \pmod{p^2}$.
Moreover the following quantitative result holds: From the abc-conjecture it follows that

\[ \#\{p \le X : a^{p-1} \not\equiv 1 \pmod{p^2}\} > C(a)\log X \]
for some positive constant C(a).
Proof: See [Sil]. □
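
The two primes mentioned above can be recovered by a direct search, sketched below. The helper names and the search bound are our own; modular exponentiation keeps the computation cheap.

```python
def is_prime(n):
    """Primality test by trial division (fine for small n)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def exceptional_primes(base, limit):
    """Primes p < limit with base^(p-1) congruent to 1 modulo p^2."""
    return [p for p in range(2, limit)
            if is_prime(p) and pow(base, p - 1, p * p) == 1]

print(exceptional_primes(2, 4000))
```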


2.3.5 Mordell’s Conjecture 


Mordell’s conjecture, proved by G. Faltings, states that on every algebraic curve of genus
g ≥ 2 defined over an algebraic number field K there are only finitely many K-rational
points.

N. Elkies [El] proved that Mordell’s conjecture follows from the abc-conjecture for 
algebraic number fields (see 3.1 below). 


2.3.6 Class Numbers of Imaginary Quadratic Fields 


Let −d be a negative fundamental discriminant. A. Granville and H. Stark [GS] proved that
the uniform abc-conjecture (see 3.2 below) implies the following estimate
for the class number of the field $\mathbb{Q}(\sqrt{-d})$:

\[ h(-d) = \left(\frac{\pi}{3}+o(1)\right)\frac{\sqrt{d}}{\log d}\sum\frac{1}{\alpha}, \]

where the sum runs over all reduced quadratic forms αx² + βxy + γy² of discriminant −d.
A much weaker result has been proved unconditionally (see [GZ]):

\[ h(-d) > C(\varepsilon)\,(\log d)^{1-\varepsilon}, \]
for every ε > 0 and an effectively computable constant C(ε) > 0.


2.3.7 Squarefree Values of Polynomials 


No polynomial f ∈ Z[x] of degree ≥ 4 is known such that f(m) is squarefree for
infinitely many m ∈ Z.

On the other hand from the abc-conjecture it follows that the polynomials $(x^n-1)/(x-1)$
and the n-th cyclotomic polynomial $\Phi_n(x)$, for n ≥ 1, take squarefree values infinitely
many times, see [BFGS].

This result has been strengthened considerably by A. Granville ([Gr]): Assume that the
abc-conjecture is true. Suppose that f(x) ∈ Z[x] does not have multiple roots. Let

\[ B = \gcd\{f(n) : n\in\mathbb{Z}\}, \]

and let B′ be the minimal divisor of B such that B/B′ is squarefree.
Then f(m)/B′ is squarefree for infinitely many m ∈ Z.
2.3.8 Elliptic Curves


Let E be an elliptic curve defined over Q. Denote by N its conductor, and by Δ its minimal
discriminant.
Then the Szpiro conjecture (see [Oe], [Sz]) says that

\[ |\Delta| < C(\varepsilon)\,N^{6+\varepsilon}. \]

Moreover there is the Goldfeld-Szpiro conjecture (see [GS]) giving an estimate of the order
of the Tate-Shafarevich group Ш of the curve E:

\[ \#\text{Ш} < C(\varepsilon)\,N^{1/2+\varepsilon}. \]

It is known (see [GS]) that both these conjectures are equivalent provided the Birch-
Swinnerton-Dyer conjecture is true.
Now let a, b ∈ N be relatively prime integers, and let a + b = c. We consider the elliptic
curve
\[ y^2 = x(x-a)(x+b). \]

One can prove (assuming some mild conditions on a, b) that its conductor N equals r(abc),
and its minimal discriminant Δ equals (abc)² (up to absolutely bounded powers of 2).
The Szpiro conjecture (for these curves only!) follows easily from the abc-conjecture.
Namely we have
\[ \Delta = \mu(abc)^2 < \mu c^6, \qquad N = \nu\, r(abc), \]

where μ, ν are powers of 2 bounded by absolute constants. Hence from the abc-conjecture
$c < C(\varepsilon)\,r(abc)^{1+\varepsilon}$ we get

\[ \Delta < \mu c^6 < \mu C(\varepsilon)^6\, r(abc)^{6(1+\varepsilon)} = \mu C(\varepsilon)^6\,\nu^{-6(1+\varepsilon)} N^{6+6\varepsilon} < C'(\varepsilon')\,N^{6+\varepsilon'}, \]

where ε′ = 6ε.

Therefore one may expect that extremal examples for the abc-conjecture give elliptic
curves with Δ and #Ш large with respect to N. In fact B.M.M. de Weger [We1] has
computed many examples which support this expectation.

E.g. the best known example for the abc-conjecture, found by E. Reyssat ($a = 2$, $b = 3^{10}\cdot 109$),
gives the elliptic curve with #Ш = 361 = 19² and $\#\text{Ш}/N^{1/2} = 0.7359$, but example 87
in the table at the end of the present paper, corresponding to $a = 7^3$, $b = 5^{13}\cdot 181$,
defines a curve isogenous to a curve with #Ш = 50176 = 224² and $\#\text{Ш}/N^{1/2} = 6.983$.
It was the largest known value of $\#\text{Ш}/N^{1/2}$. On the other hand de Weger conjectured that
for every ε > 0 there is a C(ε) > 0 such that there exist infinitely many elliptic curves
defined over Q satisfying

\[ \#\text{Ш} > C(\varepsilon)\,N^{1/2-\varepsilon}. \]


Recently A. Nitaj [Ni1] has found the elliptic curve


Y74+XY+Y = X24 X* — 353297842869446461 1454768601LX 
— 80827582979574301299537222938555582992327, 


with #Ш = 1832² and $\#\text{Ш}/N^{1/2} = 42.265$. It corresponds in some way to example 18
in the table at the end of the present paper.
2.4 Weak Forms of the abc-conjecture


Now we shall discuss some weaker forms of the abc-conjecture which also imply many
theorems and conjectures in number theory.

Namely, we replace the radical $r(n) = \prod_{p\mid n} p$ by some larger number; denote it by r′(n)
for a moment. Then we consider the corresponding function

\[ L'(a,b) = \frac{\log c}{\log r'(abc)}, \]

where a, b run over relatively prime natural numbers, and c = a + b.
The weak form of the abc-conjecture says that

\[ \limsup L'(a,b) \le 1. \]
We look for functions r′(n) such that the weak abc-conjecture implies the same statements
as the abc-conjecture itself. Of course, proofs of such implications are in general not as easy
as the proofs in the case of the abc-conjecture.


To make our discussion more precise, we introduce the following notation. Let f : N → R
be a function satisfying

\[ 1 \le f(k) \le k \quad\text{for every } k\in\mathbb{N}, \]
and a condition on the growth at infinity:

\[ \lim_{k\to\infty} f(k)/k = \alpha, \quad\text{where } \alpha\ge 0. \]

Then we define the multiplicative function $r_f : \mathbb{N}\to\mathbb{R}$ by the condition
\[ r_f(p^k) = p^{f(k)} \quad\text{for every prime number } p \text{ and } k\ge 1. \]
For relatively prime natural numbers a, b let c = a + b and

\[ L_f(a,b) = \frac{\log c}{\log r_f(abc)}. \]

The weak abc-conjecture (with a given function f as above) says that
\[ \limsup L_f(a,b) \le 1, \tag{2.1} \]
where a, b run over all relatively prime natural numbers.
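
The definitions of $r_f$ and $L_f$ can be illustrated by a small sketch. The function names and the sample choice f(k) = max(1, k/2) are ours and serve only as an illustration; with f(k) = 1 one recovers the ordinary radical.

```python
import math

def factorize(n):
    """Prime factorization of n as a dict {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def r_f(n, f):
    """Multiplicative function with r_f(p^k) = p^f(k)."""
    return math.prod(p ** f(k) for p, k in factorize(n).items())

def L_f(a, b, f):
    """L_f(a, b) = log c / log r_f(abc) with c = a + b."""
    c = a + b
    return math.log(c) / math.log(r_f(a * b * c, f))

constant_one = lambda k: 1          # gives the usual radical
half = lambda k: max(1, k / 2)      # sample f with f(k)/k -> 1/2

print(r_f(360, constant_one), r_f(360, lambda k: k))
print(L_f(2, 3**10 * 109, constant_one), L_f(2, 3**10 * 109, half))
```

Since a larger f gives a larger $r_f$, the value of $L_f$ can only decrease when f is increased, which is why the conjecture with such an f is weaker.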


2.4.1 Basic Properties of $L_f$


Lemma 1 If $\lim_{k\to\infty} f(k)/k = 0$, then $\limsup L_f(a,b) \ge 1$, i.e. in the weak abc-conjecture
(2.1) we can replace the inequality by the equality.
Proof: For every natural number n we have
\[ 3^{2^n} = 1 + 2^{n+2}t, \quad\text{where } (t,6)=1. \]
Put $a = 1$, $b = 2^{n+2}t$, $c = 3^{2^n}$. Since $r_f(t) \le t < 3^{2^n}2^{-n-2}$, we get
\[ r_f(abc) = r_f(2^{n+2})\,r_f(t)\,r_f(3^{2^n}) \le 2^{n+2}\,t\,3^{f(2^n)} < 3^{2^n}\cdot 3^{f(2^n)}. \]
Hence

\[ L_f(a,b) = \frac{\log c}{\log r_f(abc)} > \frac{2^n\log 3}{2^n\log 3 + f(2^n)\log 3} = \frac{1}{1+f(2^n)/2^n} \to 1 \]

as n → ∞. □
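
The identity used at the start of the proof of Lemma 1 can be verified for small n with exact integer arithmetic. This is only a sketch; the function name `cofactor_t` is ours.

```python
import math

def cofactor_t(n):
    """Return t with 3^(2^n) = 1 + 2^(n+2) * t, checking exact divisibility."""
    c = 3 ** (2 ** n)
    t, remainder = divmod(c - 1, 2 ** (n + 2))
    if remainder != 0:
        raise ValueError("2^(n+2) does not divide 3^(2^n) - 1")
    return t

for n in range(1, 11):
    t = cofactor_t(n)
    # t must be prime to 6, i.e. odd and not divisible by 3
    print(n, math.gcd(t, 6))
```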
Lemma 2 If for some 0 < β ≤ 1, we have f(k)/k ≥ β for every k ∈ N, then

\[ \limsup L_f(a,b) \le \frac{1}{2\beta}. \]

In particular, if f(k)/k ≥ 1/2 for k ∈ N, then $\limsup L_f(a,b) \le 1$, i.e. the weak abc-conjecture
is a theorem.


Proof: From f(k) ≥ kβ it follows that $r_f(n) \ge n^{\beta}$ for every n ∈ N. Therefore assuming
that a < b we get

\[ L_f(a,b) = \frac{\log(a+b)}{\log r_f(ab(a+b))} \le \frac{\log(a+b)}{\beta\log(ab(a+b))} < \frac{\log 2b}{\beta\log b^2} \to \frac{1}{2\beta} \]

as b → ∞. Hence $\limsup L_f(a,b) \le \frac{1}{2\beta}$. □
2.5 Some Consequences of the Weak abc-conjecture 


2.5.1 Fermat’s Last Theorem for Large Exponents 


Theorem 2.3 The weak abc-conjecture with f satisfying $\lim_{k\to\infty} f(k)/k = \alpha < 1/3$
implies Fermat’s Last Theorem for large exponents.


Proof: For every β satisfying α < β < 1/3, from the assumption we get
\[ f(k)/k < \beta \quad\text{for } k\ge k_0. \]
Hence for every $m = \prod_{p\mid m} p^{v_p}$ and every $n \ge k_0$ we have
\[ r_f(m^n) = \prod_{p\mid m} p^{f(nv_p)} \le \prod_{p\mid m} p^{nv_p\beta} = m^{n\beta}. \tag{2.2} \]

Suppose that relatively prime natural numbers x, y, z satisfy $x^n + y^n = z^n$, for some $n \ge k_0$.
Put $a = x^n$, $b = y^n$, $c = z^n$. Then
\[ r_f(abc) = r_f(x^n)\,r_f(y^n)\,r_f(z^n) \le (xyz)^{n\beta} < z^{3n\beta}. \]

Consequently
\[ L_f(a,b) = \frac{\log c}{\log r_f(abc)} > \frac{\log z^n}{\log z^{3n\beta}} = \frac{1}{3\beta}, \]
and $\limsup L_f(a,b) \ge \frac{1}{3\beta} > 1$. Contradiction. □


2.5.2 The Equation n! + 1 = m² (see [Ov])


Theorem 2.4 Assume that f(2k) ≤ 2kq for some q < 1 and every k ∈ N, and that
$\lim_{k\to\infty} f(k)/k = 0$. Then from the weak abc-conjecture with the function f it follows
that the equation n! + 1 = m² has only a finite number of solutions in natural numbers
m, n.


Proof: For every natural number $m = \prod_{p\mid m} p^{v_p}$ from the assumption we get

\[ r_f(m^2) = \prod_{p\mid m} p^{f(2v_p)} \le \prod_{p\mid m} p^{2v_pq} = m^{2q}. \tag{2.3} \]

To prove the theorem it will be sufficient to prove that

\[ \lim_{n\to\infty}\frac{\log r_f(n!)}{\log n!} = 0. \tag{2.4} \]

In fact, if n! + 1 = m², then

\[ L_f(n!,1) = \frac{\log m^2}{\log r_f(m^2)+\log r_f(n!)} \ge \frac{\log m^2}{\log(m^{2q})+\log r_f(n!)} = \frac{1}{q+\dfrac{\log r_f(n!)}{\log(n!+1)}} \to \frac{1}{q} \]

as n → ∞ in view of (2.3) and (2.4). Consequently $\limsup L_f(n!,1) \ge \frac{1}{q} > 1$, where n
runs over all solutions of the equation n! + 1 = m², provided the number of solutions is
infinite. Contradiction.
To prove (2.4) we shall use the Stirling formula

\[ \log n! = n\log n + O(n), \]

and we shall estimate $\log r_f(n!)$ from above.
It is well known that
\[ n! = \prod_{p\le n} p^{a_p(n)}, \]
where

\[ a_p(n) = \frac{n-s_p(n)}{p-1}, \tag{2.5} \]

and $s_p(n)$ is the sum of digits of n written in base p.
We shall write $\log r_f(n!)$ as the sum of two summands, and we shall estimate them
separately:

\[ \log r_f(n!) = \sum_{p\le n} f(a_p(n))\log p = \sum_{p\le n/\log n} f(a_p(n))\log p + \sum_{n/\log n<p\le n} f(a_p(n))\log p. \]


For p < n/logn, in view (2.5) we have 


ee n — Sp(n) . n — log n a a (logn)? 
p-1 n/logn n log 2 
asn — OO. 
Then
\[ g(n) := \max_{p\le n/\log n}\frac{f(a_p(n))}{a_p(n)} \to 0 \]
as n → ∞ in view of the assumption of the theorem.
To get the estimation of the first summand we use the inequality

\[ a_p(n) = \frac{n-s_p(n)}{p-1} \le \frac{2n}{p} \tag{2.6} \]
and the well known formula (see [Pr])
\[ \sum_{p\le x}\frac{\log p}{p} = \log x + O(1). \tag{2.7} \]
We have
\[ \sum_{p\le n/\log n} f(a_p(n))\log p \le \sum_{p\le n/\log n} g(n)\,a_p(n)\log p \le 2n\,g(n)\sum_{p\le n/\log n}\frac{\log p}{p} \]
\[ = 2n\,g(n)\bigl(\log(n/\log n)+O(1)\bigr) \le 2n\,g(n)\log n + O\bigl(n\,g(n)\log\log n\bigr) = o(n\log n). \tag{2.8} \]

To estimate the second summand we use (2.6) and (2.7) once more. We have

\[ \sum_{n/\log n<p\le n} f(a_p(n))\log p \le \sum_{n/\log n<p\le n}\frac{2n}{p}\log p = 2n\bigl(\log n-\log(n/\log n)+O(1)\bigr) = 2n\log\log n+O(n) = o(n\log n). \tag{2.9} \]

From (2.8) and (2.9) we get (2.4). □
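
The exponent formula (2.5) and the bound (2.6) used in the proof above can be checked directly for small n. The sketch below compares (2.5) with a naive count of the factors p in 1, 2, ..., n; the function names are ours.

```python
def digit_sum(n, p):
    """Sum of the digits of n written in base p."""
    total = 0
    while n:
        n, digit = divmod(n, p)
        total += digit
    return total

def exponent_by_formula(n, p):
    """a_p(n) = (n - s_p(n)) / (p - 1), as in (2.5)."""
    return (n - digit_sum(n, p)) // (p - 1)

def exponent_by_counting(n, p):
    """Exponent of p in n!, counted factor by factor."""
    total = 0
    for m in range(1, n + 1):
        while m % p == 0:
            m //= p
            total += 1
    return total

# 10! = 2^8 * 3^4 * 5^2 * 7
print([exponent_by_formula(10, p) for p in (2, 3, 5, 7)])
```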
2.5.3 Hall’s Conjecture


We shall prove that the weak Hall’s conjecture stated in 2.3.3 follows also from an appropriate
weak abc-conjecture.


Theorem 2.5 Assume that f is nondecreasing and satisfies
\[ f(2k)\le k \quad\text{and}\quad f(3k)\le k, \quad\text{for } k\in\mathbb{N}. \]
Then the weak abc-conjecture with this function f implies the weak Hall’s conjecture.


Proof: For natural numbers x, y satisfying x² ≠ y³ let d = gcd(x², y³), and put
\[ a = \tfrac{1}{d}|x^2-y^3|, \quad b = \tfrac{1}{d}\min(x^2,y^3), \quad c = \tfrac{1}{d}\max(x^2,y^3). \]

Then a + b = c, and gcd(a, b) = 1.
Since f is nondecreasing, from the divisibility m | n it follows that $r_f(m) \le r_f(n)$.
Moreover the assumption of the theorem implies that

\[ r_f(x^2)\le x \quad\text{and}\quad r_f(y^3)\le y. \]
Consequently
\[ r_f\!\left(\frac{x^2}{d}\right)\le r_f(x^2)\le x \]
and
\[ r_f\!\left(\frac{y^3}{d}\right)\le r_f(y^3)\le y. \]

Hence $r_f(b)\,r_f(c)\le xy$.
From the weak abc-conjecture we get, for some $C_1(\varepsilon) > 0$,

\[ c < C_1(\varepsilon)\,r_f(abc)^{1+\varepsilon} \le C_1(\varepsilon)\left(xy\,\tfrac{1}{d}|x^2-y^3|\right)^{1+\varepsilon}, \]

and consequently
\[ x^2 < C_1(\varepsilon)\,(xy|x^2-y^3|)^{1+\varepsilon}, \]
\[ y^3 < C_1(\varepsilon)\,(xy|x^2-y^3|)^{1+\varepsilon}. \]

Multiplying these inequalities we obtain
\[ x^2y^3 < C_1(\varepsilon)^2\,(xy|x^2-y^3|)^{2+2\varepsilon}, \]

and hence
\[ |x^2-y^3|^{2+2\varepsilon} > C_2(\varepsilon)\,x^{-2\varepsilon}y^{1-2\varepsilon}. \]

For x < y² it follows that
\[ |x^2-y^3|^{2+2\varepsilon} > C_2(\varepsilon)\,y^{1-6\varepsilon}. \]
Consequently
\[ |x^2-y^3| > C_3(\varepsilon)\,y^{1/2-\varepsilon'}, \quad\text{with } \varepsilon' = \frac{7\varepsilon}{2(1+\varepsilon)}. \]
If x ≥ y² we have obviously

\[ |x^2-y^3| \ge y^4-y^3 > y^{1/2} \]

for y > 1. □


3 Modifications and Generalizations of the abc-conjecture 


3.1 The abc-conjecture for Algebraic Number Fields (see [El])


For an algebraic number field K of finite degree over Q let $\|\cdot\|_v$ run over all normalized
valuations of K; in particular let $\|\cdot\|_{\mathfrak p}$ be the non-archimedean valuation corresponding to
a prime ideal $\mathfrak p$ of the ring of algebraic integers of K.

For a, b, c ∈ K* satisfying a + b = c let

\[ h_K(a,b,c) = \prod_v \max(\|a\|_v, \|b\|_v, \|c\|_v), \]

and let
\[ r_K(a,b,c) = \prod_{\mathfrak p\in I_K(a,b,c)} N_{K/\mathbb{Q}}(\mathfrak p), \]

where $I_K(a,b,c)$ is the set of all prime ideals $\mathfrak p$ of K such that not all numbers $\|a\|_{\mathfrak p}$, $\|b\|_{\mathfrak p}$,
$\|c\|_{\mathfrak p}$ are equal, and $N_{K/\mathbb{Q}}$ is the absolute norm.

It is easy to see that for K = Q and a, b, c ∈ N with gcd(a, b) = 1 and a + b = c we
have

\[ h_K(a,b,c) = c, \qquad r_K(a,b,c) = r(abc). \]
This suggests the following definitions. For an algebraic number field K and a, b, c ∈ K*
with a + b = c, where not all a, b, c are units in K, let

\[ L_K(a,b) = \frac{\log h_K(a,b,c)}{\log r_K(a,b,c)}, \]
and let
\[ \mathcal{L}_K = \{L_K(a,b) : a,b,c\in K^*\}. \]

The abc-conjecture for an algebraic number field K. The maximal limit point of the set
$\mathcal{L}_K$ is equal to 1, or equivalently, for every ε > 0 there exists a constant $C_{K,\varepsilon}$ depending
on the field K and ε only, such that

\[ h_K(a,b,c) < C_{K,\varepsilon}\,r_K(a,b,c)^{1+\varepsilon} \]

for every a, b, c ∈ K* satisfying a + b = c.
Now we simplify the definition of the set $\mathcal{L}_K$. From the product formula
\[ \prod_v \|d\|_v = 1 \]

for every d ∈ K*, and from the definition of the set $I_K(a,b,c)$ of prime ideals of K it
follows that the functions $h_K$ and $r_K$ defined above are scaling invariant:

\[ h_K(da,db,dc) = h_K(a,b,c), \qquad r_K(da,db,dc) = r_K(a,b,c), \]

for every d ∈ K*. Therefore we can assume e.g. that b = −1. Then for a ∈ K*, a ≠ 1
we get

\[ h_K(a,b,c) = h_K(a,-1,a-1) = \prod_{v\ \text{arch.}} \max(\|a\|_v, 1, \|1-a\|_v)\cdot\prod_{\|a\|_{\mathfrak p}>1}\|a\|_{\mathfrak p}. \]

Similarly

\[ r_K(a,b,c) = r_K(a,-1,a-1) = \prod_{\mathfrak p\in I_K(a)} N_{K/\mathbb{Q}}(\mathfrak p), \]

where $I_K(a)$ is the set of prime ideals $\mathfrak p$ of K such that $\|a\|_{\mathfrak p}\ne 1$ or $\|1-a\|_{\mathfrak p}<1$. It follows
that $r_K(a,-1,a-1) = 1$ iff a(1 − a) is a unit.

To simplify the notation we denote by K° the set of all a ∈ K* such that a ≠ 1 and
a(1 − a) is not a unit.

Therefore it is natural to consider the following functions of one variable

\[ h_K(a) = h_K(a,-1,a-1), \qquad r_K(a) = r_K(a,-1,a-1) \]

and
\[ L_K(a) = \frac{\log h_K(a)}{\log r_K(a)}, \]
where a ∈ K°.
Then we get

\[ \mathcal{L}_K = \{L_K(a) : a\in K^\circ\}. \]

Let us note that
\[ h_K(a) = h_K(1-a) = h_K(1/a) \quad\text{and}\quad r_K(a) = r_K(1-a) = r_K(1/a). \]


3.2 The Uniform abc-conjecture 


Now we state the abc-conjecture for the field of all algebraic numbers.
If K ⊂ M are algebraic number fields (of finite degrees over Q, as usual) and a ∈ K°,
then in general $L_K(a)$ defined above is not equal to $L_M(a)$.

Moreover, it is easy to see that the set $\bigcup_K \mathcal{L}_K$, where K runs over all algebraic number
fields, is not bounded above.
E.g. for $K = \mathbb{Q}(\sqrt[n]{2})$ and a = −1 we have $L_K(-1) = n$. In fact,

\[ h_K(-1) = \prod_{v\ \text{arch.}}\max(1,\|2\|_v) = \prod_{v\ \text{arch.}}\|2\|_v = 2^n, \]

and $r_K(-1) = 2$, since $\|-1\|_{\mathfrak p} = \|1\|_{\mathfrak p} = \|2\|_{\mathfrak p}$ does not hold iff $\mathfrak p$ is the unique prime
ideal of K dividing 2, and then $N_{K/\mathbb{Q}}(\mathfrak p) = 2$.

Therefore it is desirable to modify the function $L_K$ in such a way that the new function,
say $\bar L_K$, will be independent of K, i.e.

\[ \bar L_K(a) = \bar L_M(a), \quad\text{for } a\in K^\circ. \]

We should take into account the ramification of prime ideals in Q(a). We modify the
definition of $I_K(a)$ and $r_K(a)$ as follows.
For a ∈ K, let $J_K(a)$ be the set of all prime ideals of K which divide prime ideals of
Q(a) ramified in Q(a).
Let
\[ \bar I_K(a) = I_K(a)\cup J_K(a). \]

From this definition it follows that for K ⊂ M and a ∈ K, and for prime ideals $\mathfrak p$ and $\mathfrak P$ of
K and M respectively such that $\mathfrak P \mid \mathfrak p$ we have

\[ \mathfrak p\in J_K(a) \quad\text{iff}\quad \mathfrak P\in J_M(a). \]

Similarly

\[ \mathfrak p\in I_K(a) \quad\text{iff}\quad \mathfrak P\in I_M(a), \]
and hence

\[ \mathfrak p\in \bar I_K(a) \quad\text{iff}\quad \mathfrak P\in \bar I_M(a). \]
Let us observe that $\bar I_K(a)$ is not empty for every a ∈ K*, a ≠ 1.

Define
\[ \bar r_K(a) = \prod_{\mathfrak p\in\bar I_K(a)} N_{K/\mathbb{Q}}(\mathfrak p)^{e_{K/\mathbb{Q}}(\mathfrak p)} \quad\text{for } a\in K^*,\ a\ne 1, \]

where $e_{K/\mathbb{Q}}(\mathfrak p)$ is the ramification index of the ideal $\mathfrak p$. Then $\bar r_K(a) > 1$.


Denote

\[ \bar L_K(a) = \frac{\log h_K(a)}{\log \bar r_K(a)} \quad\text{for } a\in K^*,\ a\ne 1. \]


Theorem 3.1 If K ⊂ M are algebraic number fields and a ∈ K*, a ≠ 1, then
\[ \bar L_K(a) = \bar L_M(a). \]
Proof: From the definition of normalized valuations it follows that

\[ h_M(a) = h_K(a)^{(M:K)}. \]
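For prime ideals $\mathfrak p$ and $\mathfrak P$ of K and M respectively, such that $\mathfrak P \mid \mathfrak p$ we have
\[ \mathfrak p\in\bar I_K(a) \quad\text{iff}\quad \mathfrak P\in\bar I_M(a) \]

as observed above.
Moreover, for fixed $\mathfrak p$,

\[ \prod_{\mathfrak P\mid\mathfrak p} N_{M/K}(\mathfrak P)^{e_{M/K}(\mathfrak P)} = \prod_{\mathfrak P\mid\mathfrak p}\mathfrak p^{f_{M/K}(\mathfrak P)e_{M/K}(\mathfrak P)} = \mathfrak p^{(M:K)}, \]
where $f_{M/K}(\mathfrak P)$ is the residue field degree, since
\[ \sum_{\mathfrak P\mid\mathfrak p} f_{M/K}(\mathfrak P)\,e_{M/K}(\mathfrak P) = (M:K). \]
Consequently

\[ \bar r_M(a) = N_{M/\mathbb{Q}}\Bigl(\prod_{\mathfrak P\in\bar I_M(a)}\mathfrak P^{e_{M/\mathbb{Q}}(\mathfrak P)}\Bigr) = N_{K/\mathbb{Q}}\Bigl(N_{M/K}\Bigl(\prod_{\mathfrak p\in\bar I_K(a)}\prod_{\mathfrak P\mid\mathfrak p}\mathfrak P^{e_{M/\mathbb{Q}}(\mathfrak P)}\Bigr)\Bigr) \]
\[ = N_{K/\mathbb{Q}}\Bigl(\prod_{\mathfrak p\in\bar I_K(a)}\mathfrak p^{e_{K/\mathbb{Q}}(\mathfrak p)}\Bigr)^{(M:K)} = \bar r_K(a)^{(M:K)}, \]

since $e_{M/\mathbb{Q}}(\mathfrak P) = e_{M/K}(\mathfrak P)\cdot e_{K/\mathbb{Q}}(\mathfrak p)$.
The result follows from the definition of $\bar L_K$. □


We denote the function $\bar L_K$ simply by L since it does not depend on K. It is defined for
all algebraic numbers a ≠ 0, 1.

The uniform abc-conjecture.
\[ \limsup L(a) = 1, \]

where a runs over all algebraic numbers ≠ 0, 1.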


Example: The maximal known values of L have been discovered by N. Broberg [Bro]. 
They correspond to the equalities 


lL. (=W2" +62 = 01402)" L = 2.543107 
2. (V2 —19°3 — V2) + (V2 4+ 1°34 V2) = (V2)! L = 2.232501 
Ey 4) Sy L = 1.967616 
4. 14+ (5+2V6)> = (V6 — 2)7(3 + V6)? L = 1.919721 
§ GO 1)! 66? = 1) = Gy 3 137 L = 1.893627 
6. (5 —2V6)§ + (54+ 2V6)(V6 — 2)9(3 — VO)(1 + V6)(V6 — 1) +72 = 1 L = 1.714895 
7, AED 4 V7 = ASEH L = 1.707222 
8. (5 +2V6)2(/6 — 2993 — V6)? - 114 (5 —2V6)® = 1 L = 1.641494 
3.3 The abc-conjecture for Fields of Algebraic Functions of One Variable
3.3.1 The Field of Rational Functions


The above considerations concerning fields of algebraic numbers can be repeated with
appropriate changes in the case of any global field of characteristic zero.

We begin with the field k(x) of rational functions over a field k of characteristic zero. All
normalized valuations $\|\cdot\|_v$ trivial on k have the form $\|\cdot\|_p$, where p runs over all irreducible
polynomials in k[x], and $\|\cdot\|_{1/x}$. For every a ∈ k(x)* we have $\|a\|_p = \exp(-v_p(a)\cdot\deg p)$,
where $v_p(a)$ is the p-adic valuation of a, and $\|a\|_{1/x} = \exp(-v_{1/x}(a)) = \exp(\deg a)$.

Then, for a, b, c ∈ k(x)* satisfying a + b = c we define

\[ h(a,b,c) = \prod_v\max(\|a\|_v,\|b\|_v,\|c\|_v). \]

Moreover let

\[ r(a,b,c) = \prod_{p\in I(a,b,c)}\|p\|_p^{-1} = \exp\Bigl(\deg\prod_{p\in I(a,b,c)}p\Bigr), \]

where I(a, b, c) is the set of all irreducible polynomials p ∈ k[x] or p = 1/x such that not
all numbers $\|a\|_p$, $\|b\|_p$, $\|c\|_p$ are equal.

Since h(a, b, c) and r(a, b, c) are scaling invariant, we may assume that a, b, c are
relatively prime polynomials.

Then

\[ h(a,b,c) = \exp(\max(\deg a,\deg b,\deg c)) \]
and
\[ r(a,b,c) = \exp(\deg(\mathrm{rad}(abc))+\delta), \quad \delta = 0 \text{ or } 1, \]

where rad(f) is the maximal squarefree factor of the polynomial f ∈ k[x]. In other words,
deg rad(f) is the number of distinct zeros of f in the algebraic closure of the field k.


Define
\[ L(a,b,c) = \frac{\log h(a,b,c)}{\log r(a,b,c)} \]

provided not all a, b, c belong to k.
Then for relatively prime polynomials a, b, c we have

\[ L(a,b,c) = \frac{\max(\deg a,\deg b,\deg c)}{\deg\mathrm{rad}(abc)+\delta}. \]

In this situation L(a, b, c) < 1 always holds. Namely


Theorem 3.2 ([Sto], [Mas]). For relatively prime polynomials a, b over a field k of characteristic
zero, not both belonging to k, and c = a + b, we have

\[ \max(\deg a,\deg b,\deg c) \le \deg\mathrm{rad}(abc)-1. \]
Proof: See e.g. [La]. □
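
Theorem 3.2 can be illustrated on a concrete triple. The sketch below works over Q with exact rational arithmetic and uses the fact that, in characteristic zero, deg rad(f) = deg f − deg gcd(f, f′). The helper names and the sample triple (x² − 1)² + 4x² = (x² + 1)² are our own choices; polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def deg(p):
    return len(trim(p)) - 1

def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def poly_mod(a, b):
    """Remainder of a modulo b (b must be nonzero)."""
    a, b = [Fraction(c) for c in trim(a)], trim(b)
    while len(a) >= len(b):
        coef, shift = a[-1] / b[-1], len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] -= coef * c
        a = trim(a)  # the leading term has been cancelled
    return a

def poly_gcd(a, b):
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b)
    return a

def deg_rad(p):
    """Number of distinct zeros of p in the algebraic closure."""
    return deg(p) - deg(poly_gcd(p, derivative(p)))

a = [1, 0, -2, 0, 1]   # (x^2 - 1)^2
b = [0, 0, 4]          # 4x^2
c = [1, 0, 2, 0, 1]    # (x^2 + 1)^2 = a + b
abc = poly_mul(poly_mul(a, b), c)
print(max(deg(a), deg(b), deg(c)), deg_rad(abc))
```

For this triple the inequality of Theorem 3.2 holds with equality: the maximal degree is 4, while abc has the five distinct zeros 0, ±1, ±i.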
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3.3.2 The Field of Algebraic Functions of One Variable 


Let K be a field of algebraic functions of one variable over the field of constants k of 
characteristic zero. We assume for simplicity that the field k is algebraically closed. 

For $a \in K \setminus k$ let $\deg a$ be the degree of the divisor of zeros of $a$; then $\deg a = (K : k(a))$.
For $a \in k^*$ we put $\deg a = 0$.

For a prime divisor $p$ of $K$ we define the normalized valuation $\|a\|_p = \exp(-v_p(a))$, for
$a \in K^*$, where $v_p(a)$ is the multiplicity of zero of $a$ at $p$.

Then, for $a \in K^*$, $a \ne 1$, we put

\[ h_K(a) = \prod_p \max(\|a\|_p, \|1\|_p, \|1 - a\|_p) = \exp(\deg a). \]

Let $I_K(a)$ be the set of prime divisors $p$ of $K$ such that not all numbers $v_p(a)$, $v_p(1) = 0$, $v_p(1 - a)$
are equal. Then $I_K(a)$ is the set of zeros and of poles of $a$, and of zeros of $1 - a$.

We define

\[ r_K(a) = \exp(\# I_K(a)) \quad \text{for } a \in K^*,\ a \ne 1, \]

and

\[ L_K(a) = \frac{\log h_K(a)}{\log r_K(a)} = \frac{\deg a}{\# I_K(a)} \quad \text{for } a \in K \setminus k. \]

In this situation $L_K$ is bounded from above by a constant depending only on the genus $g_K$
of $K$. Namely


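Theorem 3.3 If $K$ is a field of algebraic functions of one variable over an algebraically
closed field of constants $k$ of characteristic zero, and $a \in K \setminus k$, then

\[ \deg a \le \# I_K(a) + 2 g_K - 2. \]

Consequently

\[ L_K(a) \begin{cases} < 1, & \text{if } g_K = 0, \\ \le 1, & \text{if } g_K = 1, \\ \le 1 + \tfrac{2}{3}(g_K - 1), & \text{if } g_K > 1. \end{cases} \]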


Proof: (cf. [El]). For $a \in K \setminus k$ and a prime divisor $p$ of $K$ denote by $a(p)$ the residue of
$a$ modulo $p$ if $p$ is not a pole of $a$, and put $a(p) = \infty$ otherwise.
For every $\alpha \in k \cup \{\infty\}$ we have

\[ \#\{p : a(p) = \alpha\} = \deg a - \sum_{p,\ a(p) = \alpha} (e_{K/k(a)}(p) - 1), \]

where $e_{K/k(a)}(p)$ is the ramification index of $p$.
Therefore

\[ \# I_K(a) = \#\{p : a(p) \in \{0, 1, \infty\}\} = 3 \deg a - \sum_{p,\ a(p) \in \{0, 1, \infty\}} (e_{K/k(a)}(p) - 1). \tag{3.1} \]

On the other hand we have the Riemann-Hurwitz formula

\[ 2(g_K - 1) = 2 (K : k(a)) (g_{k(a)} - 1) + \sum_p (e_{K/k(a)}(p) - 1), \]

where $p$ runs over all prime divisors of $K$.
Since $g_{k(a)} = 0$ and $(K : k(a)) = \deg a$ we get

\[ 2 \deg a = 2(1 - g_K) + \sum_p (e_{K/k(a)}(p) - 1). \]

Then from (3.1) it follows that

\begin{align*}
\# I_K(a) &= \deg a + 2 \deg a - \sum_{p,\ a(p) \in \{0, 1, \infty\}} (e_{K/k(a)}(p) - 1) \\
          &= \deg a + 2(1 - g_K) + \sum_{p,\ a(p) \notin \{0, 1, \infty\}} (e_{K/k(a)}(p) - 1) \\
          &\ge \deg a + 2(1 - g_K).
\end{align*}

Thus

\[ \deg a \le \begin{cases} \# I_K(a) - 2, & \text{if } g_K = 0, \\ \# I_K(a), & \text{if } g_K = 1, \\ \# I_K(a) + 2(g_K - 1), & \text{if } g_K \ge 2. \end{cases} \]

Since each of $a$, $1/a$, and $1 - a$ has at least one zero, we have $\# I_K(a) \ge 3$. Therefore, for $g_K \ge 1$,

\[ L_K(a) = \frac{\deg a}{\# I_K(a)} \le 1 + \frac{2}{3}(g_K - 1), \]

while for $g_K = 0$ the first case above gives $L_K(a) < 1$. □


We shall not discuss here the uniform version of the abc-conjecture for the field of all
algebraic functions of one variable. We remark only that if $K \subset M$ are fields of algebraic
functions of one variable, and $a \in K \setminus k$ as above, then $L_K(a) = L_M(a)$, for some function
$L$ defined analogously as in the number field case.


3.4 The n-conjecture 


In the abc-conjecture we consider all sums $a + b + (-c)$ equal to zero with non-zero and
relatively prime summands. We can consider analogous sums with $n$ summands, where
$n \ge 3$.


3.4.1 The n-conjecture for the Ring Z of Integers 


Let $a_1, \ldots, a_n \in \mathbb{Z}$, $n \ge 3$, satisfy

(i) $\gcd(a_1, \ldots, a_n) = 1$,
(ii) $a_1 + a_2 + \cdots + a_n = 0$,
(iii) No proper subsum of (ii) is equal to zero.

Define

\[ L(a_1, \ldots, a_n) = \frac{\log \max_{1 \le j \le n} |a_j|}{\log r(a_1 \cdots a_n)}, \]

where $r(m)$ is the maximal squarefree divisor of $m$.

The n-conjecture for Z.

\[ \limsup L(a_1, \ldots, a_n) = 2n - 5, \]

where $(a_1, \ldots, a_n)$ runs over all $n$-tuples of integers satisfying (i)-(iii).

Theorem 3.4 ([BB]). Under the above assumptions (i)-(iii)

\[ \limsup L(a_1, \ldots, a_n) \ge 2n - 5, \quad \text{for } n \ge 3. \]


Proof: The polynomial

\[ f_k(x) = \sum_{j=0}^{k} s_{k,j}\, x^j, \]

where

\[ s_{k,j} = \frac{2k+1}{k+j+1} \binom{k+j+1}{2j+1} \in \mathbb{Z}, \]

satisfies

\[ x^k f_k\!\left(\frac{(x-1)^2}{x}\right) = \frac{x^{2k+1} - 1}{x - 1}. \tag{3.2} \]

Substituting $k = n - 3$, $x = 2^t$ in (3.2) we get

\[ 2^{t(2n-5)} - 1 - \sum_{j=0}^{n-3} s_{n-3,j}\, 2^{t(n-3-j)} (2^t - 1)^{2j+1} = 0. \]

It is the sum with $n$ summands satisfying (i)-(iii). The summand with the maximal absolute
value is $2^{t(2n-5)}$ and the product of distinct prime divisors of all summands does not exceed

\[ 2(2^t - 1) \cdot s, \quad \text{where } s = \prod_{j=0}^{n-3} s_{n-3,j}. \]

Consequently

\[ L(a_1, \ldots, a_n) \ge \frac{\log 2^{t(2n-5)}}{\log(2(2^t - 1)s)} \ge \frac{t(2n-5)\log 2}{t \log 2 + \log 2s} \longrightarrow 2n - 5, \]

as $t \to \infty$. □
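
The identity (3.2) and the resulting $n$-term sum can be checked with exact rational arithmetic. The sketch below is only an illustration under the reading $s_{k,j} = \frac{2k+1}{k+j+1}\binom{k+j+1}{2j+1}$; the function names are hypothetical and not from [BB].

```python
from fractions import Fraction
from math import comb


def s(k, j):
    """Coefficient s_{k,j} = (2k+1)/(k+j+1) * C(k+j+1, 2j+1), expected to be an integer."""
    value = Fraction(2 * k + 1, k + j + 1) * comb(k + j + 1, 2 * j + 1)
    if value.denominator != 1:
        raise ValueError("s(k, j) is not an integer")
    return int(value)


def f(k, y):
    """The polynomial f_k evaluated at y."""
    return sum(s(k, j) * y ** j for j in range(k + 1))


def identity_holds(k, x):
    """Check x^k f_k((x-1)^2 / x) == (x^(2k+1) - 1) / (x - 1) for rational x not in {0, 1}."""
    x = Fraction(x)
    lhs = x ** k * f(k, (x - 1) ** 2 / x)
    rhs = (x ** (2 * k + 1) - 1) / (x - 1)
    return lhs == rhs


def summands(n, t):
    """The n integer summands obtained from (3.2) with k = n - 3 and x = 2^t."""
    k = n - 3
    x = 2 ** t
    terms = [x ** (2 * n - 5), -1]
    terms += [-s(k, j) * x ** (k - j) * (x - 1) ** (2 * j + 1) for j in range(k + 1)]
    return terms
```

For instance, `summands(5, 3)` returns five integers that add up to zero, the largest in absolute value being $2^{15}$.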


Let us remark that in the case of the n-conjecture for Z and $n > 3$ no estimates from
below or from above of $\max(|a_1|, \ldots, |a_n|)$ by means of a function of $r(a_1 \cdots a_n)$ are
known. That is, the results of C.L. Stewart and Yu Kunrui, and of C.L. Stewart and R. Tijdeman
mentioned in 2.2 above have not been generalized to any $n > 3$.
3.4.2 The n-conjecture for the Ring of Polynomials


Let $a_1, \ldots, a_n \in k[x]$, $n \ge 3$, where $k$ is a field of characteristic zero, satisfy the conditions
(i)-(iii) given above.
Define

\[ L(a_1, \ldots, a_n) = \frac{\max_{1 \le j \le n} \deg a_j}{\deg r(a_1 \cdots a_n)}, \]

where $r(m)$ is the maximal squarefree factor of the polynomial $m$. We assume here that not
all $a_1, \ldots, a_n$ belong to $k$.

The n-conjecture for k[x].

\[ \limsup L(a_1, \ldots, a_n) = 2n - 5, \]

where $(a_1, \ldots, a_n)$ runs over all $n$-tuples of polynomials in $k[x]$, not all constant, satisfying
(i)-(iii).

Theorem 3.5 ([BB], [BM]). Under the above assumptions (i)-(iii)

\[ 2n - 5 \le \limsup L(a_1, \ldots, a_n) \le \binom{n-1}{2}, \quad \text{for } n \ge 3. \]


Moreover, for n = 3 and 4 the above inequalities can be replaced by the equalities. 


Proof: Substituting $k = n - 3$, $x = y^t$ in (3.2) we get analogously as before the sum of $n$
summands satisfying the conditions (i)-(iii).

The summand with the maximal degree is $y^{t(2n-5)}$, and the product of distinct irreducible
factors of all summands equals $y(y^t - 1)$. Consequently

\[ L(a_1, \ldots, a_n) = \frac{t(2n-5)}{t+1} \longrightarrow 2n - 5, \]

as $t \to \infty$.

This proves the first inequality. The second one has been proved in [BM], see also [V]
and [Za].

The last part of the theorem follows from the observation that $2n - 5 = \binom{n-1}{2}$ for $n = 3$
and $n = 4$. □


3.4.3 The Strong n-conjecture 
Now we replace the condition (i) above by the stronger one:
(i') $\gcd(a_i, a_j) = 1$ for all $1 \le i < j \le n$,

and we remove the condition (iii). Then we can state the corresponding conjecture as
follows.
The strong n-conjecture: For every $n \ge 3$ there is a real number $A_n$ such that

\[ \limsup L(a_1, \ldots, a_n) = A_n, \]

where $(a_1, \ldots, a_n)$ runs over all $n$-tuples satisfying (i') and (ii).
For $n = 3$ and $A_3 = 1$ the strong n-conjecture is simply the abc-conjecture.
In the case of the ring of integers we have
In the case of the ring of integers we have 
Theorem 3.6 (S. Konyagin [Ko]).

\[ \limsup L(a_1, \ldots, a_n) \ge \begin{cases} 3/2, & \text{if } n > 3 \text{ is odd}, \\ 1, & \text{if } n \ge 4 \text{ is even}, \end{cases} \]

where $(a_1, \ldots, a_n)$ runs over all $n$-tuples of integers satisfying (i') and (ii).

Proof: For $n = 5$ we use the identity

\[ (6^m + 1)^3 - (6^m - 1)^3 = 6^{2m+1} + 1 + 1, \quad m \ge 1, \]

i.e. we put $a_1 = (6^m + 1)^3$, $a_2 = -(6^m - 1)^3$, $a_3 = -6^{2m+1}$, $a_4 = a_5 = -1$. Evidently
the conditions (i') and (ii) are satisfied.

Moreover

\[ L(a_1, \ldots, a_5) = \frac{3 \log(6^m + 1)}{\log \mathrm{rad}(6(6^{2m} - 1))} \ge \frac{3 \log(6^m + 1)}{\log(6^{2m+1})} \longrightarrow \frac{3}{2} \]

as $m \to \infty$.

For any other odd $n$, $n = 2k + 1$, it is sufficient to take $a_1, \ldots, a_5$ as above and to put
$a_6 = a_8 = \cdots = a_{2k} = 1$, and $a_7 = a_9 = \cdots = a_{2k+1} = -1$.

If $n \ge 4$ is even, then it is sufficient to choose a prime number $p > n - 2$ and to define
$a_1 = p^m$, $a_2 = -p^m + (n - 2)$, $a_3 = \cdots = a_n = -1$, where $m \ge 1$. Then evidently the
conditions (i') and (ii) are satisfied, and

\[ L(a_1, \ldots, a_n) = \frac{m \log p}{\log \mathrm{rad}(p\,(p^m - n + 2))} \ge \frac{m \log p}{(m + 1) \log p} \longrightarrow 1 \]

as $m \to \infty$. □
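
The five-term construction for $n = 5$ can be explored numerically for small $m$. The following Python sketch is a minimal illustration with hypothetical helper names; it uses naive trial division, which is adequate only for the small values tried here.

```python
from math import gcd, log


def prime_factors(n):
    """Set of primes dividing n (trial division; fine for small examples)."""
    n = abs(n)
    primes = set()
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def konyagin_terms(m):
    """a_1, ..., a_5 built from (6^m + 1)^3 - (6^m - 1)^3 = 6^(2m+1) + 2."""
    x = 6 ** m
    return [(x + 1) ** 3, -(x - 1) ** 3, -6 * x * x, -1, -1]


def pairwise_coprime(terms):
    """Condition (i'): gcd(a_i, a_j) = 1 for all i < j."""
    return all(
        gcd(terms[i], terms[j]) == 1
        for i in range(len(terms))
        for j in range(i + 1, len(terms))
    )


def quality(terms):
    """L(a_1, ..., a_n) = log max|a_j| / log r(a_1 ... a_n)."""
    primes = set()
    for t in terms:
        primes |= prime_factors(t)
    radical = 1
    for p in primes:
        radical *= p
    return log(max(abs(t) for t in terms)) / log(radical)
```

Calling `quality(konyagin_terms(m))` for increasing `m` gives values that are bounded below by $3\log(6^m+1)/\log(6^{2m+1})$, in line with the estimate in the proof.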


In the case of the ring of polynomials we have 


Theorem 3.7 (see [SS]). If $n \ge 3$ and $a_1, \ldots, a_n$ are polynomials in $k[x_1, \ldots, x_r]$, not all
constant, satisfying (i') and (ii), then

\[ \max_{1 \le j \le n} \deg a_j \le (n - 2)(\deg r(a_1 \cdots a_n) - 1), \]

where $r(m)$ is the maximal squarefree divisor of a polynomial $m$.
Consequently

\[ \limsup L(a_1, \ldots, a_n) \le n - 2. \]


We have the following 
Example: (D. Davies). One can easily verify the polynomial identity

\[ (x^m + 2)^3 - x^{3m} - 6(x^m + 1)^2 - 2 = 0, \]

for $m \in \mathbb{N}$. The summands on the l.h.s. are relatively prime by pairs.
Therefore if we take, for a fixed $n > 3$,

\[ a_1 = (x^m + 2)^3, \quad a_2 = -x^{3m}, \quad a_3 = -6(x^m + 1)^2, \quad a_4 = 2 - n, \quad a_5 = \cdots = a_n = 1, \]

then conditions (i') and (ii) are satisfied, and

\[ L(a_1, \ldots, a_n) = \frac{3m}{2m + 1} \longrightarrow \frac{3}{2} \]

as $m \to \infty$.

From the example it follows that in Theorem 3.7 the constant $n - 2$ on the r.h.s. cannot
be replaced by a constant less than 3/2.
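
The example rests on the identity $(y + 2)^3 - y^3 - 6(y + 1)^2 - 2 = 0$ with $y = x^m$. A short Python sketch (hypothetical function names, given only as an illustration) checks it at several integer points, which suffices because the left-hand side has degree at most 3 in $y$, and evaluates the ratio $3m/(2m + 1)$ exactly.

```python
from fractions import Fraction


def davies_residual(y):
    """Left-hand side of the identity with y standing for x^m."""
    return (y + 2) ** 3 - y ** 3 - 6 * (y + 1) ** 2 - 2


def davies_ratio(m):
    """max deg a_j / deg r(a_1 ... a_n) = 3m / (2m + 1) for the example."""
    return Fraction(3 * m, 2 * m + 1)


# A polynomial of degree <= 3 vanishing at 5 points is identically zero.
identity_ok = all(davies_residual(y) == 0 for y in range(-2, 3))

# The ratio increases towards 3/2 without reaching it.
ratios = [davies_ratio(m) for m in (1, 10, 100, 1000)]
```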


4 The Set of all Limit Points 


4.1 Limit Points in the Case of the abc-conjecture for the Ring Z of Integers 


For $a, b, c \in \mathbb{N}$, where $\gcd(a, b) = 1$ and $c = a + b$, we defined the set $\mathcal{L} = \{L(a, b)\}$
where

\[ L(a, b) = \frac{\log c}{\log r(abc)} \]

and $r(n)$ is the maximal squarefree divisor of a natural number $n$.
The abc-conjecture (for the ring Z of rational integers) claims that

\[ \limsup \mathcal{L} = 1. \]

It is known only that $1 \le \limsup \mathcal{L} \le \infty$.

We shall consider the related more general question: What can be proved on the set $\mathcal{L}'$
of all limit points of the set $\mathcal{L}$?

In the above notation we have $\max(a, b) < c$ and $r(abc) \le abc < c^3$. Hence

\[ L(a, b) = \frac{\log c}{\log r(abc)} > \frac{1}{3}. \]

Therefore $\mathcal{L}' \subset [\tfrac{1}{3}, \infty]$.
We shall prove the following weak theorem concerning the set $\mathcal{L}'$ to present the main
ideas of proofs of similar stronger theorems.

Theorem 4.1 In the case of the abc-conjecture for Z we have $[\tfrac{1}{3}, \tfrac{1}{2}] \subset \mathcal{L}'$.


Proof: Let $\alpha > 1$ be a fixed real number and let $a \in [X, 2X]$ be a fixed prime number,
where $X$ is large.
Let $b$ run over squarefree numbers in the interval $[X^\alpha, 2X^\alpha]$ prime to $a$. Then the number
of such $b$'s is at least

\[ \left(\frac{6}{\pi^2} - \frac{1}{X}\right) X^\alpha + O(\sqrt{X^\alpha}). \tag{4.1} \]

Then $c = a + b$ runs over some numbers in the interval $[X + X^\alpha, 2(X + X^\alpha)]$.
Since the number of not squarefree numbers in this interval is

\[ \left(1 - \frac{6}{\pi^2}\right)(X + X^\alpha) + O(\sqrt{X^\alpha}) \]

and it is less than (4.1) for large values of $X$, we conclude that for some $b$ in question the
number $c$ is also squarefree.

Then for these $a, b, c$ we have $r(abc) = abc$.

Since $X^\alpha < c < 4X^\alpha$ and $X^{1+2\alpha} < abc < 16X^{1+2\alpha}$, we get $c = \lambda X^\alpha$, $abc = \mu X^{1+2\alpha}$,
where $\lambda$, $\mu$ are bounded by absolute constants. Consequently

\[ L(a, b) = \frac{\alpha \log X + \log \lambda}{(1 + 2\alpha) \log X + \log \mu} \longrightarrow \frac{\alpha}{1 + 2\alpha} = \frac{1}{2 + 1/\alpha} \]

as $X$ tends to infinity.
The set $\{\frac{1}{2 + 1/\alpha} : \alpha > 1\}$ equals $(\tfrac{1}{3}, \tfrac{1}{2})$ and the set $\mathcal{L}'$ is closed. Therefore $[\tfrac{1}{3}, \tfrac{1}{2}] \subset \mathcal{L}'$.
□

The best known result in this direction is 


Theorem 4.2 ([BFGS], [GN]). For the set $\mathcal{L}'$ for the ring Z of integers we have

\[ \left[\tfrac{1}{3}, \tfrac{36}{37}\right] \subset \mathcal{L}'. \]

The proof of this theorem consists of several steps. In every step it is proved that some
subinterval of $[\tfrac{1}{3}, \tfrac{36}{37}]$ is contained in $\mathcal{L}'$.

E.g. to prove that $[\tfrac{6}{7}, \tfrac{12}{13}] \subset \mathcal{L}'$ we consider the polynomial identity

\[ x^3(x^3 + 2y^3)^3 + (x^3 + y^3)(y^3 - x^3)^3 = y^3(2x^3 + y^3)^3. \]

It can be proved that for fixed $\alpha > 1$ and $X$ large we can choose $x \in [X, 2X]$ and
$y \in [X^\alpha, 2X^\alpha]$ such that

\[ xy(2x^3 + y^3)(x^3 + y^3)(x^3 - y^3)(x^3 + 2y^3)/2 \tag{4.2} \]

is squarefree; it is the most difficult part of the proof.
Then taking

\[ a = x^3(x^3 + 2y^3)^3, \quad b = (x^3 + y^3)(y^3 - x^3)^3, \quad c = y^3(2x^3 + y^3)^3 \]

we get $\gcd(a, b) \le 2$ and $c = \lambda X^{12\alpha}$, $r(abc) = \nu\,|xy(2x^3 + y^3)(x^3 + y^3)(x^3 - y^3)(x^3 + 2y^3)| = \mu X^{13\alpha + 1}$, where $\lambda$, $\mu$, $\nu$ are
bounded by absolute constants. Therefore (dividing by $\gcd(a, b)$ if necessary) we obtain
from (4.2)

\[ L(a, b) = \frac{\log c}{\log r(abc)} = \frac{12\alpha \log X + \log \lambda}{(13\alpha + 1) \log X + \log \mu} \longrightarrow \frac{12\alpha}{13\alpha + 1} = \frac{12}{13 + 1/\alpha} \]

as $X$ tends to infinity.

Consequently

\[ \left\{\frac{12}{13 + 1/\alpha} : \alpha > 1\right\} = \left(\frac{12}{14}, \frac{12}{13}\right) \]

and hence $[\tfrac{6}{7}, \tfrac{12}{13}] \subset \mathcal{L}'$.
The last step of the proof invokes in a similar way a polynomial identity of degree 48.
On the other hand a result of M. Filaseta and S. Konyagin ([FK]) gives some information
on larger limit points:


Theorem 4.3 Foreverye &€ [0, 1) there is a limit point of the set L belonging to the interval 


3 : : ts are : 3 
Cowes str]. In particular there is such a limit point in the interval [1, 5]. 


It is not known if any point of the interval (38, 32] belongs to £’. Nevertheless the 


following conditional result holds. 
Theorem 4.4 ([BFGS]). The abc-conjecture implies that $\mathcal{L}' = [\tfrac{1}{3}, 1]$.


4.2 Limit Points for Modified and Generalized abc-conjectures 


1. In the case of the n-conjecture for the ring Z of integers there is a limit point of the set
$\mathcal{L}$ belonging to the interval $[2n - 5, \infty]$, $n = 3, 4, \ldots$, see Theorem 3.4 above. Moreover
from the results given above and from the identity (3.2) it follows immediately that


l 36 
—, (2n —5)— Ls > 3 
[= (2n | c n> 


in the case of the n-conjecture. 

Similarly the result of [FK] implies that there is a limit point of the set C belonging to 
the interval [2n — 5, 3n — 2) in the case of the n-conjecture. 

2. In the case of the n-conjecture for the ring of polynomials $k[x]$ over a field $k$ of
characteristic zero D. Davies [D] proved that

\[ \left[\tfrac{1}{n}, 2n - 5\right] \subset \mathcal{L}'. \]

Hence $\mathcal{L}' = [\tfrac{1}{3}, 1]$ for $n = 3$ and $\mathcal{L}' = [\tfrac{1}{4}, 3]$ for $n = 4$, since for $n = 3$ and 4 the n-conjecture
is a theorem.

3. In the case of the strong n-conjecture for the ring Z of integers we know only that
there is a limit point of the set $\mathcal{L}$ belonging to the interval $[\tfrac{3}{2}, \infty]$ provided $n$ is odd and to
$[1, \infty]$ if $n$ is even (see Theorem 3.6).

4. In the case of the strong n-conjecture for the ring $k[x_1, \ldots, x_r]$ of polynomials of
many variables over a field of characteristic zero H.N. Shapiro and G.H. Sparer [SS] proved
that

\[ \mathcal{L}' \subset \left[\tfrac{1}{n}, n - 2\right]. \]

I do not know if this interval can be replaced by a smaller one.
5 Numerical Results


At the end of the paper there is the table of all known triples (a, b, c) of relatively prime 
natural numbers such that L(a, b) > 1.4. 
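
For readers who wish to recompute table entries, here is a minimal Python sketch of $L(a, b) = \log c / \log r(abc)$. The function names are hypothetical and trial division is used purely for illustration; it would be far too slow for the larger triples.

```python
from math import gcd, log


def rad(n):
    """Maximal squarefree divisor of n, by trial division."""
    n = abs(n)
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            result *= p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        result *= n
    return result


def quality(a, b):
    """L(a, b) = log c / log r(abc) for coprime natural numbers a, b and c = a + b."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be relatively prime")
    c = a + b
    # a, b, c are pairwise coprime, so r(abc) = r(a) r(b) r(c).
    return log(c) / log(rad(a) * rad(b) * rad(c))


# The triple 2 + 3^10 * 109 = 23^5.
best_known = quality(2, 3 ** 10 * 109)
```

The value `best_known` agrees with the largest entry 1.629912 of the table to the displayed precision.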

Now we make a remark on the examples in the table. Let $N$ be the number of triples
$(a, b, c)$ in the table satisfying $c < 2^n$. We give below the values of $N$ for $15 \le n \le 60$, $3 \mid n$,
and for $n = 80$. For $n \le 36$ all such examples are known ([Ka]), thus the corresponding
values of $N$ are correct.

log_2 c <   15  18  21  24  27  30  33  36  39  42  45   48   51   54   57   60   80
N            3   8  12  18  26  34  51  62  71  78  90  100  107  115  125  129  140


Let us observe that $N$ behaves like a linear function of $\log_2 c$ (approximately $N = 3 \log_2 c - 49$
for $2^{15} < c < 2^{60}$). If this were true in general, we would have a contradiction
with the abc-conjecture.
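
The approximate linear behaviour can be checked with an ordinary least-squares fit to the tabulated values for $15 \le n \le 60$. This is only an illustrative sketch; the fitting method is our choice and is not stated in the text.

```python
# Tabulated data: thresholds n (meaning c < 2^n) and the counts N, for 15 <= n <= 60.
log2_c = list(range(15, 61, 3))
counts = [3, 8, 12, 18, 26, 34, 51, 62, 71, 78, 90, 100, 107, 115, 125, 129]


def least_squares(xs, ys):
    """Slope and intercept of the least-squares line through the points (x, y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares(log2_c, counts)
```

The fitted slope is close to 3 and the intercept close to -50, consistent with the rough approximation $N = 3 \log_2 c - 49$ quoted above.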

Few triples (a, b, c) with c > 2° and L > 1.4 are known. This may be caused by the
difficulty of factoring large numbers.

On the other hand, no contradiction with the known examples occurs if we weaken the
abc-conjecture as follows:

\[ \limsup L(a, b) < 2, \]

where $a$, $b$ run over all relatively prime natural numbers. From this weak form of the
abc-conjecture one can also deduce many other theorems and conjectures like those discussed
above.
Table of the extremal abc examples (version of June 1999)

No.   L(a, b, c)   a + b
 1    1.629912     2 + 3^10 · 109
 2    1.625991     11^2 + 3^2 · 5^6 · 7^3
 3    1.623490     19 · 1307 + 7 · 29^2 · 31^8
 4    1.580756     283 + 5^11 · 13^2
 5    1.567887     1 + 2 · 3^7
 6    1.547075     7^3 + 3^10
 7    1.544434     7^2 · 41^2 · 311^3 + 11^16 · 13^2 · 79
 8    1.536714
 9    1.526999
10    1.522160
11    1.502839
12    1.497621
13    1.492432
14    1.491590
15    1.489246
16    1.488865
17    1.482910
18    1.481322
19    1.474450
20    1.474137
21    1.471298
22    1.465676
23    1.465520
24    1.461924
25    1.459425
26    1.457794
27    1.457790
28    1.457066
29    1.456203
30    1.455673
31    1.455126
32    1.453343
33    1.452613
34    1.451344
35    1.450858
36    1.450026
37    1.449651
38    1.447977
39    1.447743
40    1.446246
41    1.445064
42    1.444199
43    1.443502
44    1.443307
45    1.443284
On Values of Linear and Quadratic Forms
at Integral Points 


S.G. Dani 


The aim of this article is to give an exposition of certain applications of the study of
the homogeneous space SL(n, R)/SL(n, Z) and the flows on it induced by subgroups of
SL(n, R), to problems on values of linear and quadratic forms at integral points. Also,
some complements to Margulis's theorem on Oppenheim's conjecture are proved.

Let $n \ge 2$ and let $\mathbb{R}^n$ be the n-dimensional euclidean space, viewed as the space of
n-rowed column vectors with real entries. We denote by $\mathbb{Z}^n$ the usual lattice in $\mathbb{R}^n$
consisting of column vectors with integral entries. We recall that the group GL(n, R) of $n \times n$
nonsingular real matrices is a locally compact topological group and the integral matrices
with determinant $\pm 1$ form a discrete subgroup, which is denoted by GL(n, Z). The
following fact is basic to many applications. The space of lattices in $\mathbb{R}^n$ can be identified
canonically with the homogeneous space GL(n, R)/GL(n, Z), by identifying the lattice
$g\mathbb{Z}^n$, $g \in$ GL(n, R), with the coset gGL(n, Z) (we note that any lattice in $\mathbb{R}^n$ is of the
form $g\mathbb{Z}^n$ for some $g \in$ GL(n, R)). The correspondence has many interesting properties.
Firstly, the (intrinsically defined) topology on the space of lattices*:!* coincides, under 
the correspondence as above, with the quotient topology on GL(n, R)/GL(n, Z). If we 
consider the subspace consisting of lattices with a given discriminant (volume of a fun- 
damental parallelopiped for the lattice), then it corresponds to an orbit of SL(n, R), the 
subgroup of GL(n, R) consisting of matrices of determinant 1; the orbit can be realised as 
SL(n, R)/SL(n, Z), SL(n, Z) being the subgroup of SL(n, R) consisting of matrices with 
integer entries. 


1 The Homogeneous Space SL(n, R)/SL(n, Z) 


It will be convenient from this point on to consider SL(n, R)/SL(n, Z), which corresponds
to the space of lattices with a given discriminant, rather than the whole space
GL(n, R)/GL(n, Z), which is in fact a continuous union of copies of the former,
parametrised by the discriminant. We shall denote by $\mathcal{L}^1$ the space of lattices in $\mathbb{R}^n$ with
discriminant 1 (for convenience we shall exclude $n$ from the notation, as there is no possibility
of confusion on account of it).

The space SL(n, R)/SL(n, Z) is noncompact (this follows for instance from the following
theorem). However, it carries a finite measure invariant under the action of SL(n, R)
by translations on the left (cf. [21]). This means that though noncompact it is still a ‘small’ 
space; it has a compact part together with a narrow cusp (end); it may be observed that 
'tending to infinity' in such a space corresponds to going further and further into the cusp.
We next recall the Mahler criterion which relates the geometric asymptotics in the space 
SL(n, R)/SL(n, Z) with Diophantine asymptotics of lattices?!. 


Theorem 1.1 A sequence $\{\Lambda_i\}$ of lattices in $\mathcal{L}^1$ tends to infinity if and only if there exists a
sequence $\{x_i\}$ of nonzero vectors in $\mathbb{R}^n$ such that $x_i \in \Lambda_i$ for all $i$ and $x_i \to 0$ as $i \to \infty$.


The correspondence is also amenable to analytical techniques. Let $f$ be a measurable
function on $\mathbb{R}^n$ vanishing outside a compact set ($f$ need not be continuous). Following
Siegel we associate to $f$ a function $\tilde{f}$ on $\mathcal{L}^1$ by setting

\[ \tilde{f}(\Lambda) = \sum_{x \in \Lambda - (0)} f(x) \quad \text{for all } \Lambda \in \mathcal{L}^1; \]

observe that since $f$ vanishes outside a compact set and $\Lambda$ is discrete, the sum on the right
hand side is in fact a finite sum. It turns out that via the transform the Lebesgue integral
corresponds to the integral on $\mathcal{L}^1$ = SL(n, R)/SL(n, Z) with respect to the SL(n, R)-invariant
probability measure (that is, the invariant measure normalised to have total measure 1).
Specifically we have the following:


Theorem 1.2 (C.L. Siegel”*). Let f be an integrable function on R" vanishing outside a 
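compact subset of $\mathbb{R}^n$. Then $\tilde{f}$ is integrable and

\[ \int_{\mathcal{L}^1} \tilde{f}\, dm = \int_{\mathbb{R}^n} f\, dl, \]

where $l$ is the Lebesgue measure on $\mathbb{R}^n$ and $m$ is the SL(n, R)-invariant probability measure
on $\mathcal{L}^1$.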
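
To make the Siegel transform concrete, the following Python sketch evaluates $\tilde{f}(\Lambda)$ when $f$ is the indicator function of a ball, so that $\tilde{f}(\Lambda)$ counts the nonzero lattice points in the ball. It is an illustration only: the lattice is given by a basis, and the search is restricted to integer coefficients of absolute value at most `bound`, which is an assumption that must be chosen large enough for the lattice at hand.

```python
from itertools import product


def siegel_transform(f, basis, bound):
    """Sum f over the nonzero points sum_i c_i * basis[i] with |c_i| <= bound."""
    n = len(basis)
    total = 0
    for coeffs in product(range(-bound, bound + 1), repeat=n):
        if not any(coeffs):
            continue  # skip the zero vector
        point = tuple(
            sum(c * vec[k] for c, vec in zip(coeffs, basis)) for k in range(n)
        )
        total += f(point)
    return total


def ball_indicator(radius):
    """Indicator function of the closed ball of the given radius centered at 0."""
    def f(x):
        return 1 if sum(t * t for t in x) <= radius * radius else 0
    return f


# The standard lattice Z^2 and a second lattice of discriminant 1.
standard = [(1, 0), (0, 1)]
stretched = [(2, 0), (0, 0.5)]
count_standard = siegel_transform(ball_indicator(1.6), standard, 4)
count_stretched = siegel_transform(ball_indicator(1.6), stretched, 4)
```

Here `count_standard` is 8 and `count_stretched` is 6, while the disc has area about 8.04; Theorem 1.2 says that the average of such counts over all of $\mathcal{L}^1$ equals the area.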


Siegel deduced from this that for any star-shaped body $S$ in $\mathbb{R}^n$, centered at the origin,
with volume less than $\zeta(n)$ there exists a lattice $\Lambda$ in $\mathbb{R}^n$ such that $S \cap \Lambda = (0)$, a statement
which was formulated earlier by Minkowski and first proved by Hlawka. The theorem
plays an important role in the recent results on the asymptotics of the solutions of quadratic
inequalities, described in §8.


2 Integral Points in Convex Sets 


It may be recalled that by Minkowski's theorem any convex symmetric body in $\mathbb{R}^n$ with
volume exceeding $2^n$ contains a nonzero point of any lattice $\Lambda \in \mathcal{L}^1$ (see [3] or [14]).
The result applies in particular to regions defined by inequalities of the form $|L_i(v)| < a_i$,
$i = 1, \ldots, k$, for some $k \ge 1$, where the $L_i$'s are linear forms and the $a_i$'s are positive real numbers.
Similarly, it also applies to sets of the form $Q(v) < a$ where $Q$ is a positive definite quadratic
form (one can also consider mixed systems, so long as the volumes of the regions can be
computed). When $k = n$ and $L_1, \ldots, L_n$ are linearly independent (in the vector space of
linear forms) the regions as above are parallelopipeds and Minkowski's theorem is optimal
in this case as, for example, the open cube of side 2 centered at the origin (which has
volume $2^n$) does not contain any nonzero integral point.
On the other hand for balls (and ellipsoids) centered at 0, the classical theorem of Hermite
gives a better bound for existence of nonzero lattice points than the Minkowski theorem for
general convex symmetric bodies, namely, a ball contains a nonzero point of every lattice in
$\mathcal{L}^1$ if its radius exceeds $(4/3)^{(n-1)/4}$. This can be deduced from considerations of a fundamental
domain for the homogeneous space SL(n, R)/SL(n, Z), which we shall discuss next.

Let $D$ denote the subgroup of SL(n, R) consisting of all diagonal matrices with positive
entries. We shall write the diagonal matrices in the form $\mathrm{diag}(d_1, \ldots, d_n)$, where $d_1, \ldots, d_n$
are the diagonal entries (in the natural order of the entries). For $\sigma > 0$ let $D_\sigma$ denote the
subset of $D$ consisting of those $d = \mathrm{diag}(d_1, \ldots, d_n)$ for which $d_i/d_{i+1} \le \sigma$ for all
$i = 1, \ldots, n - 1$. We note that for $d = \mathrm{diag}(d_1, \ldots, d_n) \in D_\sigma$, since $d_1 d_2 \cdots d_n = 1$
we have $d_1^n = (d_1/d_2)^{n-1} (d_2/d_3)^{n-2} \cdots (d_{n-1}/d_n) \le \sigma^{n(n-1)/2}$ and hence $d_1 \le \sigma^{(n-1)/2}$.
Let $N$ be the subgroup of SL(n, R) consisting of all upper triangular matrices with 1's on
the diagonal and for any $t > 0$ let $N_t$ denote the subset of $N$ consisting of all those matrices
for which every off-diagonal entry is of absolute value at most $t$. Also let $K$ denote the
subgroup of SL(n, R) consisting of orthogonal matrices of determinant 1. A set of the form
$K D_\sigma N_t = \{kdu \mid k \in K, d \in D_\sigma, u \in N_t\}$ is called a Siegel set. It can be verified that
any Siegel set is of finite Haar measure in SL(n, R). It is a crucial fact that for suitable
values of $\sigma$ and $t$ the corresponding Siegel set is a fundamental domain for SL(n, Z) in
SL(n, R), namely we have the following.


Theorem 2.1 (cf. [21], Ch. X). Let $F = K D_{2/\sqrt{3}} N_{1/2}$. Then
SL(n, R) = F(SL(n, Z)).


Concerning lattice points this implies that if $e_1$ is the column vector in which the first row
entry is 1 and the others are 0 then the set $Fe_1 = \{ge_1 \mid g \in F\}$ contains a nonzero point
of any lattice $\Lambda \in \mathcal{L}^1$ (though the argument could be applied to other integral points in
the place of $e_1$, it does not lead to any useful information when the point is not a multiple
of $e_1$). Now, $Fe_1$ can be seen to be the complement of $\{0\}$ in the closed ball of radius
$(4/3)^{(n-1)/4}$, namely the maximum possible value for the first entry of any element of $D_{2/\sqrt{3}}$.
This establishes the result of Hermite mentioned earlier, that a ball centered at 0 contains
a nonzero point of any lattice from $\mathcal{L}^1$ if its radius exceeds $(4/3)^{(n-1)/4}$; for a closed ball the
conclusion holds for radius $(4/3)^{(n-1)/4}$ as well. Since any positive definite quadratic form
$Q$ on $\mathbb{R}^n$ is equivalent to the quadratic form $Q_0$ given by the square of the usual norm
(namely there exists a $g \in$ GL(n, R) such that $Q(v) = Q_0(gv)$ for all $v \in \mathbb{R}^n$) this implies
the following.


Corollary 2.2 Let $Q$ be a positive definite quadratic form on $\mathbb{R}^n$. Then there exists a nonzero $x \in \mathbb{Z}^n$
such that $Q(x) \le (4/3)^{(n-1)/2} d(Q)^{1/n}$, where $d(Q)$ denotes the discriminant of $Q$.
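
A brute-force check of the corollary for binary forms can be sketched in Python as follows. The sketch assumes that $d(Q)$ is the determinant of the symmetric matrix of $Q$, a convention the text does not spell out, and it searches only a small box of integer vectors; all names are illustrative.

```python
from itertools import product


def hermite_bound(n, disc):
    """The bound (4/3)^((n-1)/2) * d(Q)^(1/n) from the corollary."""
    return (4 / 3) ** ((n - 1) / 2) * disc ** (1 / n)


def min_nonzero_value(q, n, box):
    """Minimum of q over nonzero integer vectors with entries in [-box, box]."""
    return min(q(x) for x in product(range(-box, box + 1), repeat=n) if any(x))


def q_hexagonal(x):
    """Q(x, y) = x^2 + x*y + y^2, with matrix [[1, 1/2], [1/2, 1]] of determinant 3/4."""
    return x[0] ** 2 + x[0] * x[1] + x[1] ** 2


def q_square(x):
    """Q_0(x, y) = x^2 + y^2, with identity matrix of determinant 1."""
    return x[0] ** 2 + x[1] ** 2


min_hex = min_nonzero_value(q_hexagonal, 2, 5)
min_square = min_nonzero_value(q_square, 2, 5)
bound_hex = hermite_bound(2, 3 / 4)
bound_square = hermite_bound(2, 1.0)
```

Under the stated convention the hexagonal form attains the bound (both sides equal 1 up to rounding), while for $Q_0$ the minimum 1 lies strictly below the bound $(4/3)^{1/2}$.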


3 A Theorem of Howe and Moore 


In this section, we recall a theorem of Howe and Moore, specialised to the case of the 
homogeneous space SL(n, R)/SL(n, Z), and discuss its implications to lattices in R”. 
As before we shall denote by $m$ the normalised SL(n, R)-invariant measure on $\mathcal{L}^1$ =
SL(n, R)/SL(n, Z).


Theorem 3.1 (Howe and Moore!>). Let {g;} be a divergent sequence in S L(n, R) (namely 
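with no limit point in SL(n, R)). Then for any square-integrable functions $f$ and $\phi$ on $\mathcal{L}^1$
we have

\[ \int_{\mathcal{L}^1} f(g_i x)\,\phi(x)\, dm \longrightarrow \int_{\mathcal{L}^1} f\, dm \int_{\mathcal{L}^1} \phi\, dm, \quad \text{as } i \to \infty. \]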


This implies the following corollary; it may be mentioned here that the corollary gen- 
eralises a theorem proved in a number-theoretic context by W.M. Schmidt??; the result of 
Schmidt is the particular case of the following corollary when the sequence {g;} is chosen 
to be $\{\mathrm{diag}(i, i, \ldots, i, i^{-n+1})\}$.


Corollary 3.2 Let $\{g_i\}$ be a divergent sequence in SL(n, R). Then for almost every lattice
$\Lambda \in \mathcal{L}^1$ the set of lattices $\{g_i \Lambda\}$ is dense in $\mathcal{L}^1$.


For convenience we shall include here a proof of the corollary. Before going over to the
proof let me however note the following. In the corollary 'almost every' is meant in the sense
that the set of lattices $\Lambda \in \mathcal{L}^1$ for which the statement does not hold (for a given sequence
$\{g_i\}$ as in the hypothesis) is a set of $m$-measure 0, where $m$ is the SL(n, R)-invariant measure
on $\mathcal{L}^1$. In the set of all lattices (not only from $\mathcal{L}^1$) one can also interpret 'almost every' to
mean a lattice generated by $n$ linearly independent vectors $v_1, \ldots, v_n$, which may be chosen
from a set of full Lebesgue measure. It can be seen that the corollary means also that for
almost every lattice $\Lambda$ in this sense $\{g_i \Lambda\}$ is dense in the set of lattices whose discriminant
coincides with that of $\Lambda$ (we shall however not go into the details of this).


Proof of the Corollary: Firstly we observe that as a consequence of the theorem, for any
measurable subset $E$ of $\mathcal{L}^1$ such that $m(E) > 0$ we have $m(\bigcup_i g_i^{-1}(E)) = 1$; for if $E$
is such a set and $f$ and $\phi$ are the characteristic functions of $E$ and $\bigcup_j g_j^{-1}(E)$
respectively then we have $f(g_i x)\phi(x) = f(g_i x)$ for all $x \in \mathcal{L}^1$, for all $i$. Hence, by the
theorem and the invariance of $m$,
$(\int f\, dm)(\int \phi\, dm) = \lim_i \int f(g_i x)\phi(x)\, dm(x) = \lim_i \int f(g_i x)\, dm(x) = \int f\, dm$,
and since $\int f\, dm = m(E) > 0$
this implies that $m(\bigcup_i g_i^{-1}(E)) = \int \phi\, dm = 1$, as claimed. Now let $\{\Omega_j\}$ be a
countable basis for the topology of $\mathcal{L}^1$. Then for all $j$ we have $m(\Omega_j) > 0$ and hence
$m(\bigcup_i g_i^{-1}(\Omega_j)) = 1$. Put $Y = \bigcap_j \bigcup_i g_i^{-1}(\Omega_j)$. Then $m(Y) = 1$ and for any $\Lambda \in Y$, $\{g_i \Lambda\}$
is dense in $\mathcal{L}^1$.

The corollary implies in particular that for any closed noncompact subgroup $H$ of
SL(n, R), for almost all $\Lambda \in \mathcal{L}^1$ the $H$-orbit of $\Lambda$ is dense in $\mathcal{L}^1$. The theorem of Howe and
Moore implies further that for such an $H$ a measurable function $f$ on $\mathcal{L}^1$ is $H$-invariant (i.e.
$f(h\Lambda) = f(\Lambda)$ for all $h \in H$) if and only if it is constant almost everywhere. In particular
all $H$-invariant measurable sets are of measure 0 or 1, namely the action is 'ergodic'.


4 Duality and Values of Linear Forms 


Let $H$ be a closed noncompact subgroup of SL(n, R) and consider the action of SL(n, R)
on the quotient space SL(n, R)/H. By restriction we get an action of SL(n, Z) on
SL(n, R)/H. We note that for a $g \in$ SL(n, R) the $H$-orbit of gSL(n, Z) is dense in
SL(n, R)/SL(n, Z) if and only if the SL(n, Z)-orbit of $g^{-1}H$ is dense in SL(n, R)/H, as
either of them holds if and only if HgSL(n, Z) is a dense subset of SL(n, R). Since almost
all $H$-orbits are dense (as seen in the previous section) it follows that almost all orbits of
SL(n, Z) are dense in SL(n, R)/H.

The correspondence as above, known as duality, can be used together with results about 
orbits of subgroups H of SL(n, R) on SL(n, R)/SL(n, Z) to deduce results about orbits 
of SL(n, Z) under various linear actions which are of significance in terms of Diophantine 
approximation. For many subgroups, especially those generated by unipotent elements, the 
closure of orbits is well-understood, thanks to the work of M. Ratner on Raghunathan’s 
conjecture (see [22] and [7] for details). For the so called horospherical subgroups the 
Raghunathan conjecture was proved earlier. We next recall some results on values of linear 
forms which follow from the orbit behaviour of the horospherical subgroups. 

For $1 \le p \le n - 1$ let $E_p$ denote the p-fold Cartesian product $\mathbb{R}^n \times \cdots \times \mathbb{R}^n$.
An element $(v_1, \ldots, v_p) \in E_p$ is called a Euclidean p-frame if $v_1, \ldots, v_p$ are linearly
independent. For $f = (v_1, \ldots, v_p) \in E_p$ we shall denote by $\langle f \rangle$ the subspace of $\mathbb{R}^n$
spanned by $\{v_1, \ldots, v_p\}$. The duality argument as above implies in particular the following
result, first proved in [11].

Theorem 4.1 (cf. [11]). Consider the action of SL(n, Z) on $E_p$ given by the componentwise
action on each copy of $\mathbb{R}^n$. Let $f = (v_1, \ldots, v_p)$ be a Euclidean p-frame. Then the
SL(n, Z)-orbit of $f$ is dense in $E_p$ if and only if $\langle f \rangle$ contains no nonzero integral vector
(namely, $\langle f \rangle \cap \mathbb{Z}^n = (0)$).

More generally, if x,,...,xqg € 2", q = 1, are such that 


(Uipsies Ups Xinccestg yi C4 Ais eit) 


(when the $x_j$'s are 0 we get the special case as above) and $\Gamma$ is the subgroup $\{\gamma \in$ SL(n, Z) $\mid$
$\gamma(x_j) = x_j$ for all $j = 1, \ldots, q\}$, then the $\Gamma$-orbit of $f$ is dense in $E_p$.


The second assertion in the theorem follows from the proof of Proposition 4.3 in [11]; in 
the statement of the proposition there, the conclusion as above is claimed under a weaker 
hypothesis, which however is incorrect. 

Theorem 4.1 can be interpreted as a result on values of linear forms at integral points,
by viewing each vector in $\mathbb{R}^n$ as the coefficients of a linear form. Let $\mathcal{F}_n$ denote the space
of linear forms on $\mathbb{R}^n$, namely the dual space. We consider the action of SL(n, Z) on $\mathcal{F}_n$
given by $(\gamma, L) \mapsto L \circ \gamma^{-1}$ for all $\gamma \in$ SL(n, Z) and $L \in \mathcal{F}_n$ (the so called contragredient
action). Via the duality of $\mathcal{F}_n$ and $\mathbb{R}^n$ as vector spaces the theorem implies the following:


Corollary 4.2 Let $L_1, \ldots, L_p$ be linear forms on $\mathbb{R}^n$, where $1 \le p \le n - 1$. Let
$M_1, \ldots, M_q$, $q \ge 0$, be rational linear forms on $\mathbb{R}^n$ such that a linear combination of
the form $\sum_i \lambda_i L_i + \sum_j \mu_j M_j$, where the $\lambda_i$'s and $\mu_j$'s are real numbers, is not a rational
form unless $\lambda_i = 0$ for all $i$. Then

\[ \{(L_1 \circ \gamma, \ldots, L_p \circ \gamma) \mid \gamma \in \mathrm{SL}(n, \mathbb{Z}),\ M_j \circ \gamma = M_j \text{ for all } j = 1, \ldots, q\} \]

is dense in $\mathcal{F}_n \times \cdots \times \mathcal{F}_n$ ($p$ copies). In particular for any $t_1, \ldots, t_q \in \mathbb{R}$ for which there
exists $x_0 \in \mathbb{Z}^n$ such that $M_j(x_0) = t_j$ for all $j = 1, \ldots, q$, any $a_1, \ldots, a_p \in \mathbb{R}$ and $\epsilon > 0$
there exists $x \in P(\mathbb{Z}^n)$ such that
$|L_i(x) - a_i| < \epsilon$ and $M_j(x) = t_j$, for all $i = 1, \ldots, p$ and $j = 1, \ldots, q$.


Similarly, one can also obtain results involving approximation of vectors and forms 
simultaneously. In this respect we note the following (see [4]): 


Theorem 4.3 Let k,l € {1,...,n} be such thatk +1 < n — 1 and let X denote the 
space of $(k + l)$-tuples of the form $(v_1, \ldots, v_k, L_1, \ldots, L_l)$ where $v_i \in \mathbb{R}^n$, $L_j \in \mathcal{F}_n$
and $L_j(v_i) = 0$ for all $i = 1, \ldots, k$ and $j = 1, \ldots, l$ (we realise $X$ as a subset of
$\mathbb{R}^n \times \cdots \times \mathbb{R}^n \times \mathcal{F}_n \times \cdots \times \mathcal{F}_n$, $k$ and $l$ copies respectively). Consider the SL(n, Z)-action
on $X$, defined componentwise. Then for $(v_1, \ldots, v_k, L_1, \ldots, L_l) \in X$ the orbit is
dense in $X$ if and only if $v_1, \ldots, v_k$ and $L_1, \ldots, L_l$ are linearly independent in the respective
spaces and the subspaces spanned by them do not contain any nonzero rational elements
(vector or linear form respectively).


5 The Oppenheim Conjecture 


We have seen in Section 2 that study of the space SL(n, R)/SL(n, Z) enables us to understand
integral solutions of Diophantine inequalities of the form $Q(x) < a$, where $Q$ is any positive
definite quadratic form. Study of certain flows on the space is applied to similar questions
in respect of indefinite quadratic forms. In fact in this case the only known complete results
are arrived at from the study of flows on the homogeneous space as above.

Let $Q$ be an indefinite quadratic form on $\mathbb{R}^n$. Then the sets of the form $\{v \in \mathbb{R}^n \mid$
$0 < |Q(v)| < a\}$ have infinite volume for every $a > 0$. In the light of this one may
ask whether they all contain nonzero integral points. For this to hold (for all $a > 0$) one
would have to assume however that $Q$ is not a multiple of a form with rational coefficients,
since otherwise the set of values of $Q$ at integral points would be discrete. Even with this
additional condition the conclusion does not hold for certain forms in two variables. It
is well-known that there exist numbers $\lambda \in \mathbb{R}$ such that $q^2 |\lambda - p/q|$ is bounded below by
a positive number, as $p$, $q$ vary over integers, $q \ne 0$ (the so called badly approximable
numbers; see [19], where they are actually called 'numbers of constant type') and this
means that for the quadratic form $Q(x_1, x_2) = x_2(\lambda x_2 - x_1)$ there exists $a > 0$ such that
there is no integral point $x$ with $0 < |Q(x)| < a$. There was a conjecture, originating
from an observation by Oppenheim, that for a nondegenerate indefinite quadratic
form $Q$ on $\mathbb{R}^n$, $n \ge 3$, the set of values at integral points is dense in $\mathbb{R}$ whenever $Q$ is not a
scalar multiple of a form with integer coefficients. The conjecture was proved by Margulis
in the mid eighties; see [20] for a detailed exposition, including historical aspects.
Later, the following relatively stronger assertion was proved in a joint paper of Margulis
with the present author and subsequently an elementary proof of the same result was given
in [9]; see also [5] for another elementary proof and [6] for an exposition of the proof of a
weaker result.

We recall that an element $x \in \mathbb{Z}^n$ is said to be primitive if $\frac{1}{k}x$ is not
contained in $\mathbb{Z}^n$ for any integer $k \ge 2$. We denote the set of all primitive integral
elements by $P(\mathbb{Z}^n)$.
Theorem 5.1 Let $Q$ be a nondegenerate indefinite quadratic form on $\mathbb{R}^n$, $n \ge 3$, which
is not a multiple of a form with integer coefficients. Then $Q(P(\mathbb{Z}^n))$ is dense in
$\mathbb{R}$, that is, for any $t \in \mathbb{R}$ and $\epsilon > 0$ there exists a primitive integral
vector $x$ such that $|Q(x) - t| < \epsilon$.
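
The density statement of Theorem 5.1 can be explored numerically. The following is a minimal
sketch and not part of the original argument: it searches the primitive integral vectors in a
small box for the value of one particular form closest to a target. The form
$x_1^2 + x_2^2 - \sqrt{2}\,x_3^2$, the target 0.5 and all function names are illustrative
assumptions. A finite search of this kind only illustrates the theorem and proves nothing.

```python
import math
from itertools import product


def is_primitive(x):
    """True if the integer vector x is nonzero and its entries have gcd 1."""
    g = 0
    for c in x:
        g = math.gcd(g, abs(c))
    return g == 1


def closest_primitive_value(Q, target, bound, n=3):
    """Brute-force search over primitive x with |x_i| <= bound for Q(x) nearest to target."""
    best_x, best_err = None, float("inf")
    for x in product(range(-bound, bound + 1), repeat=n):
        if not is_primitive(x):
            continue
        err = abs(Q(x) - target)
        if err < best_err:
            best_x, best_err = x, err
    return best_x, best_err


def Q(x):
    # Illustrative choice (an assumption, not from the text): indefinite, nondegenerate,
    # and not a multiple of a form with integer coefficients.
    return x[0] ** 2 + x[1] ** 2 - math.sqrt(2) * x[2] ** 2


if __name__ == "__main__":
    for bound in (5, 10, 20):
        x, err = closest_primitive_value(Q, 0.5, bound)
        print(bound, x, err)
```

Since the search box for a larger bound contains the box for a smaller one, the reported error
can only decrease as the bound grows; Theorem 5.1 asserts that it tends to 0.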


The argument of Margulis and that in the subsequent papers referred to above depends
on the study of orbits of the special orthogonal groups of indefinite quadratic forms, on
$SL(n, \mathbb{R})/SL(n, \mathbb{Z})$. Let $Q$ be a nondegenerate indefinite quadratic form on
$\mathbb{R}^n$ and let $H = SO(Q)$ be the associated special orthogonal group, namely
$\{g \in SL(n, \mathbb{R}) \mid Q(gv) = Q(v) \text{ for all } v \in \mathbb{R}^n\}$. Then $H$ is a closed
noncompact subgroup of $SL(n, \mathbb{R})$. Hence, Corollary 3.2 implies that almost all orbits of
$H$ on $\mathcal{L}^1 = SL(n, \mathbb{R})/SL(n, \mathbb{Z})$ are dense. In fact we have the following:


Theorem 5.2 Let the notation be as above; in particular, we suppose that $n \ge 3$. Then
every orbit of $H$ on $\mathcal{L}^1$ is either closed or dense.


Theorem 5.2 was proved by Margulis and the present author for $n = 3$ and applied
to deduce Theorem 5.1 (for all $n \ge 3$). For a general $n \ge 3$, it is a consequence of a
celebrated theorem of M. Ratner, proving a conjecture of Raghunathan (see [22] and [7] 
for some details and references). Theorem 5.2 also implies the following generalisation of 
Theorem 5.1, noted by A. Borel and G. Prasad. 


Theorem 5.3 (Borel and Prasad [1]). Let $Q$ be a nondegenerate indefinite quadratic form
on $\mathbb{R}^n$, $n \ge 3$, which is not a multiple of a form with integer coefficients and let $B$
be the corresponding bilinear form (defined by $B(v, w) = \frac{1}{4}(Q(v + w) - Q(v - w))$ for all
$v, w \in \mathbb{R}^n$). Then for any $v_1,\dots,v_{n-1} \in \mathbb{R}^n$ and $\epsilon > 0$ there exist
$x_1,\dots,x_{n-1} \in P(\mathbb{Z}^n)$ such that

$$|B(v_i, v_j) - B(x_i, x_j)| < \epsilon, \quad \text{for all } i, j = 1,\dots,n-1.$$


In respect of the deduction of Theorem 5.3 from Theorem 5.2 it may be worthwhile to note
the following. Let $Q$ be a nondegenerate indefinite quadratic form on $\mathbb{R}^n$. Let $Q_0$ be a
rational quadratic form and let $g \in SL(n, \mathbb{R})$ be such that for a suitable
$\lambda \in \mathbb{R}$ (depending on the discriminants of $Q$ and $Q_0$), $Q(v) = \lambda Q_0(gv)$ for
all $v \in \mathbb{R}^n$; such $Q_0$ and $g$ exist and in fact $Q_0$ can be chosen to be one of the
standard quadratic forms. Then the $H$-orbit of $gSL(n, \mathbb{Z})$ is closed in $\mathcal{L}^1$ if and
only if $Q$ is a multiple of a form with integer coefficients (see [9]). In the alternative case,
namely when the orbit is dense, the conclusion as in Theorem 5.3 follows from a continuity
argument.

It may be mentioned here that Borel and Prasad [1] also consider similar problems in the
context of nonarchimedean valuations (in the place of the usual absolute value); we shall
however not go into the details.


6 Complements to the Oppenheim Conjecture 


We shall now discuss analogous questions about values of quadratic forms at integral points,
in some cases complementary to the results on the Oppenheim conjecture.

Let us first consider the nondegeneracy condition in the Oppenheim conjecture, as in 
Theorem 5.1. Though it has been traditional to assume nondegeneracy in the context of the 
problem, unlike in some other problems the general case here does not readily reduce to
proving the result in the case of nondegenerate forms (the reader may convince himself of 
this by perusing the cases arising in the proof of Theorem 6.1 below). 

Let $Q$ be a (possibly degenerate) quadratic form. We recall that the radical of $Q$ is the
subspace $W = \{w \in \mathbb{R}^n \mid Q(v + tw) = Q(v) \text{ for all } v \in \mathbb{R}^n \text{ and } t \in \mathbb{R}\}$.
Suppose that $W$ is a rational subspace of $\mathbb{R}^n$ (namely, it is defined by linear equations
with rational coefficients). Then the image $(\mathbb{Z}^n + W)/W$ of $\mathbb{Z}^n$ in $\mathbb{R}^n/W$ is
a lattice and $\mathbb{R}^n/W$ can be realised as $\mathbb{R}^m$, where $m$ is the codimension of $W$ in
$\mathbb{R}^n$, with $(\mathbb{Z}^n + W)/W$ corresponding to $\mathbb{Z}^m$. Hence $Q$ factors to a
quadratic form, say $Q'$, on $\mathbb{R}^m$ and $Q(\mathbb{Z}^n) = Q'(\mathbb{Z}^m)$. Further,
it can be seen that there exists a positive integer r such that O’(P(Z")) C Q(P(Z")) C 
: O'(P(Z"™)). This shows in particular that if the radical of a quadratic form Q is a rational 
subspace of codimension 2 then the study of its values reduces to the 2-variable case; (see 
below for a discussion on the 2-variables case). We now note the following:- 


Theorem 6.1 Let $Q$ be a quadratic form on $\mathbb{R}^n$, $n \ge 3$, which is indefinite (that is,
there exist $v, w \in \mathbb{R}^n$ such that $Q(v) < 0 < Q(w)$) and not a multiple of a form with
rational coefficients. Then at least one of the following conditions holds:


i) the radical of $Q$ is a rational subspace of codimension 2; or
ii) $Q(P(\mathbb{Z}^n))$ is dense in $\mathbb{R}$.


Proof: If the rank of $Q$ is at least 3 then the argument as in the proof of Theorem 1 in [8]
(reducing the number of variables to 3, in which case $Q$ would be nondegenerate) shows,
together with Theorem 5.1, that condition (ii) holds. Note that since the form is indefinite,
the rank is at least 2. Now suppose that $Q$ is of rank 2. Then there exist linear forms $L_1$
and $L_2$ on $\mathbb{R}^n$ such that $Q(v) = L_1(v)L_2(v)$ for all $v \in \mathbb{R}^n$. Since $Q$ is
indefinite $L_1$ and $L_2$ are linearly independent (as linear forms). Suppose first that no linear
combination of the form $\lambda_1 L_1 + \lambda_2 L_2$, with $\lambda_1, \lambda_2 \in \mathbb{R}$, is a
rational linear form. Let $\mathcal{F}_n$ be the space of linear forms on $\mathbb{R}^n$, and consider
the contragredient action of $SL(n, \mathbb{Z})$ on $\mathcal{F}_n$ as in §4. Then, by Corollary 4.2,
for $L_1, L_2$ as above the set of pairs $\{(L_1(x), L_2(x)) \mid x \in P(\mathbb{Z}^n)\}$ is dense in
$\mathbb{R}^2$. Since $Q = L_1L_2$ this implies that $Q(P(\mathbb{Z}^n))$ is dense in $\mathbb{R}$.

Next suppose that there exist $\lambda_1, \lambda_2 \in \mathbb{R}$ such that
$\lambda_1 L_1 + \lambda_2 L_2 = L_0$, say, is a rational linear form. If the span of $L_1$ and $L_2$
contains a rational form which is not a multiple of $L_0$ then the radical of $Q$, which is given
by $\{v \in \mathbb{R}^n \mid L_1(v) = L_2(v) = 0\}$, is a rational subspace of codimension 2, so
condition (i) is satisfied. We may, therefore, suppose that any rational form in the span of
$L_1$ and $L_2$ is a multiple of $L_0$. Let $\Gamma_0$ be the subgroup
$\{\gamma \in SL(n, \mathbb{Z}) \mid L_0 \circ \gamma = L_0\}$. Since $Q$ is not a multiple of a form with
integral coefficients, at least one of $L_1$ and $L_2$ is not a multiple of a linear form with
integral coefficients; for definiteness assume that this holds for $L_1$. Then in view of our
assumption as above, any rational linear form in the span of $L_1$ and $L_0$ is a multiple of
$L_0$. Now let $t \in L_0(P(\mathbb{Z}^n))$, $t \ne 0$. Then by Corollary 4.2 the set of pairs
$\{(L_1(x), t) \mid x \in P(\mathbb{Z}^n), L_0(x) = t\}$ is dense in $\{(s, t) \mid s \in \mathbb{R}\}$. We
note that $\lambda_2 \ne 0$, since $L_0$ is a rational form while $L_1$ is not a multiple of any
rational form. Now, for any $x \in \mathbb{R}^n$,
$Q(x) = L_1(x)L_2(x) = L_1(x)(L_0(x) - \lambda_1 L_1(x))/\lambda_2$. Hence, the closure of
$Q(P(\mathbb{Z}^n))$ contains the set $\{s(t - \lambda_1 s)/\lambda_2 \mid s \in \mathbb{R}\}$, which is the
whole of $\mathbb{R}$ if $\lambda_1 = 0$ and contains
$[-t^2/4|\lambda_1\lambda_2|,\ t^2/4|\lambda_1\lambda_2|]$ if $\lambda_1 \ne 0$ (it extends to infinity
on one side, depending on the sign of $\lambda_1/\lambda_2$). Since $L_0(P(\mathbb{Z}^n))$ is unbounded
this means that $Q(P(\mathbb{Z}^n))$ is dense in $\mathbb{R}$, thus completing the proof. □
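
The last step of the proof rests on the range of the one-variable quadratic
$s \mapsto s(t - \lambda_1 s)/\lambda_2$. As a sanity check, the sketch below computes its extreme value,
attained at $s = t/(2\lambda_1)$ and equal to $t^2/(4\lambda_1\lambda_2)$, and compares it with sampled
values. The function names and the sample parameters are assumptions made for illustration only.

```python
def extreme_value(t, lam1, lam2):
    """Extreme value of s -> s*(t - lam1*s)/lam2 over real s (requires lam1 != 0)."""
    s_star = t / (2.0 * lam1)          # critical point of the quadratic
    return s_star * (t - lam1 * s_star) / lam2


def sampled_values(t, lam1, lam2, span=10.0, steps=2001):
    """Values of s*(t - lam1*s)/lam2 on an evenly spaced grid of [-span, span]."""
    out = []
    for i in range(steps):
        s = -span + 2.0 * span * i / (steps - 1)
        out.append(s * (t - lam1 * s) / lam2)
    return out


if __name__ == "__main__":
    t, lam1, lam2 = 5.0, 2.0, 3.0
    print(extreme_value(t, lam1, lam2), t * t / (4 * lam1 * lam2))
```

When $\lambda_1/\lambda_2 > 0$ the extreme value is a maximum and the range is unbounded below; when
$\lambda_1/\lambda_2 < 0$ it is a minimum and the range is unbounded above. In both cases the range
contains the symmetric interval quoted in the proof.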


We now consider the case with $n = 2$. It was noted earlier that there exist binary
quadratic forms which are not multiples of forms with integer coefficients but nevertheless
their values at integer points do not intersect some interval of the form $(0, \delta)$ for some
$\delta > 0$. Now, let $Q$ be any indefinite binary quadratic form. Then $Q(v, w)$ can be expressed
as $(av + bw)(cv + dw)$ for all $v, w \in \mathbb{R}$ (with respect to the standard basis), where
$a, b, c, d \in \mathbb{R}$ and $ad - bc \ne 0$. An elementary argument then shows that for every
$\epsilon > 0$ there exist $x, y \in \mathbb{Z}$, not both 0, such that $|Q(x, y)| < \epsilon$, if and only
if either $a/b$ or $c/d$ is not badly approximable. The question of taking values arbitrarily
close to nonzero real numbers is however more intricate (see [2]). It is closely related to the
dynamics of the flow induced by the one parameter subgroup of diagonal matrices on the space
$SL(2, \mathbb{R})/SL(2, \mathbb{Z})$. We shall, however, not go into the details of this here.
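
The role of badly approximable numbers can be seen in a small experiment. For the form
$Q(x_1, x_2) = x_2(\lambda x_2 - x_1)$ one has $|Q(p, q)| = q\,|q\lambda - p| = q^2|\lambda - p/q|$, so the
smallest nonzero value over a range of $q$ is the quantity computed below. This is a minimal
sketch in floating point; the choice of the golden ratio (a standard badly approximable number)
and of $e$ (whose continued fraction has unbounded partial quotients) is an assumption made for
illustration.

```python
import math


def min_scaled_error(lam, q_max):
    """Minimum over 1 <= q <= q_max of q^2 * |lam - p/q|, p the nearest integer to q*lam."""
    best = float("inf")
    for q in range(1, q_max + 1):
        dist = abs(q * lam - round(q * lam))   # distance from q*lam to the nearest integer
        best = min(best, q * dist)             # q * dist equals q^2 * |lam - p/q|
    return best


if __name__ == "__main__":
    golden = (1 + math.sqrt(5)) / 2
    print("golden ratio:", min_scaled_error(golden, 10000))
    print("e:           ", min_scaled_error(math.e, 10000))
```

For the golden ratio the minimum stays bounded away from 0 as the range of $q$ grows, whereas
for $e$ it keeps decreasing, in line with the criterion stated above.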


7 Simultaneous Solution of Quadratic and Linear Inequalities 


In the earlier sections, we considered inequalities involving linear and quadratic forms
separately. In this section we shall briefly consider mixed systems involving a quadratic
and a linear form, on $\mathbb{R}^3$. The following result can be deduced from Corollary 2 in [10],
using rotational symmetry, with respect to the maximal compact subgroup of the special
orthogonal group.


Theorem 7.1 Let $Q$ be a nondegenerate indefinite quadratic form on $\mathbb{R}^3$ and $L$ be a
linear form on $\mathbb{R}^3$. Suppose that the plane $\{v \in \mathbb{R}^3 \mid L(v) = 0\}$ is tangential
to the (double) cone $\{v \in \mathbb{R}^3 \mid Q(v) = 0\}$ and that no linear combination (with real
coefficients) of $Q$ and $L^2$ is a rational quadratic form. Then for any $a, b \in \mathbb{R}$ and
$\epsilon > 0$ there exists $x \in P(\mathbb{Z}^3)$ such that

$$|Q(x) - a| < \epsilon \quad \text{and} \quad |L(x) - b| < \epsilon.$$


The result is obtained by proving that for a unipotent one-parameter subgroup $\{u_t\}$ such
that $u_t - I$ is of rank 2 for $t \ne 0$ ($I$ being the identity matrix), the closure of any orbit
of the flow induced by it on $SL(3, \mathbb{R})/SL(3, \mathbb{Z})$ is an orbit of a closed subgroup of
$SL(3, \mathbb{R})$; this is a particular case of Raghunathan's conjecture, which was subsequently
proved by M. Ratner (see [22] and [7] for details).


Remark 7.2 In the notation of Theorem 7.1 it can be seen that when some linear combination
of $Q$ and $L^2$ is a rational quadratic form then the (system of) inequalities as in the
conclusion cannot admit nonzero integral solutions for all $a, b \in \mathbb{R}$ and $\epsilon > 0$;
(there could however exist solutions for $a = b = 0$, for all $\epsilon > 0$).


It turns out that the condition in the hypothesis of Theorem 7.1 that the plane
$\{v \in \mathbb{R}^3 \mid L(v) = 0\}$ be tangential to the cone $\{v \in \mathbb{R}^3 \mid Q(v) = 0\}$
cannot be weakened to the intersection of the two being nonzero; namely, intersection in a pair
of straight lines may not suffice. Specifically we note the following result; the main idea of
the proof is due to Margulis (oral communication) and the author would like to thank him for
pointing it out.
Theorem 7.3 Let $Q_0$ and $L_0$ be the forms (quadratic and linear respectively) on $\mathbb{R}^3$
defined by $Q_0(x_1e_1 + x_2e_2 + x_3e_3) = 2x_1x_3 - x_2^2$ and
$L_0(x_1e_1 + x_2e_2 + x_3e_3) = x_2$ for all $x_1, x_2, x_3 \in \mathbb{R}$, $e_1, e_2, e_3$ being the
standard basis of $\mathbb{R}^3$. Then there exist $g \in SL(3, \mathbb{R})$ and $\epsilon > 0$ such that the
following conditions are satisfied for the forms $Q$ and $L$ defined by $Q(v) = Q_0(gv)$ and
$L(v) = L_0(gv)$ for all $v \in \mathbb{R}^3$:


i) no linear combination of $Q$ and $L^2$ (with real coefficients) is a rational quadratic
form; and

ii) there does not exist any nonzero integral point $x \in \mathbb{Z}^3$ such that
$|Q(x)| < \epsilon$ and $|L(x)| < \epsilon$.


Further, the set of g € SL(3, R) for which the conditions are satisfied (for some € > 0) 
is of Hausdorff dimension 8, namely, same as the manifold dimension of SL(3, R); in 
particular there are uncountably many pairs of forms (Q, L) arising as above for which 
conditions (i) and (ii) are satisfied. 


Proof: For $t \in \mathbb{R}$ let $d_t$ denote the diagonal matrix $\mathrm{diag}(e^{-t}, 1, e^{t})$ and
let $D$ be the one-parameter subgroup $\{d_t\}$. Let $\mathcal{B}$ denote the set of lattices
$\Lambda$ in $\mathcal{L}^1$ such that the $D$-orbit of $\Lambda$ is bounded (has compact closure) in
$\mathcal{L}^1$. By a theorem of Kleinbock and Margulis [17] the set
$\{g \in SL(3, \mathbb{R}) \mid g\mathbb{Z}^3 \in \mathcal{B}\}$ intersects every nonempty open subset of
$SL(3, \mathbb{R})$ in a set of Hausdorff dimension 8; in particular it is not contained in any
countable union of proper algebraic subvarieties of $SL(3, \mathbb{R})$; (we recall that an
'algebraic subvariety' is a set of the form $\{g \in SL(3, \mathbb{R}) \mid P(g) = 0\}$, where $P$ is a
polynomial function on $SL(3, \mathbb{R})$ in the coordinate variables).

We first show that if $\Lambda \in \mathcal{B}$ then there exists $\epsilon > 0$ such that $\Lambda$
contains no nonzero point $p$ with $|Q_0(p)| < \epsilon$ and $|L_0(p)| < \epsilon$; this implies that if
$g \in SL(3, \mathbb{R})$ is such that $\Lambda = g\mathbb{Z}^3$ then condition (ii) as in the conclusion
of the theorem is satisfied for the forms $Q$ and $L$ defined by $Q(v) = Q_0(gv)$ and
$L(v) = L_0(gv)$ for all $v \in \mathbb{R}^3$.

Let $\Lambda \in \mathcal{B}$ and suppose, if possible, that for all $\epsilon > 0$ there exists
$p \in \Lambda$, $p \ne 0$, such that $|L_0(p)| < \epsilon$ and $|Q_0(p)| < \epsilon^2$ (the square in the
latter inequality is chosen for computational convenience). Consider any $\epsilon > 0$ small
enough so that the open ball of radius $3\epsilon$ centered at 0 contains no nonzero point of
$\Lambda$ and let $p$ be a point satisfying the above conditions. If
$p = x_1e_1 + x_2e_2 + x_3e_3$, where $e_1, e_2, e_3$ is the standard basis, then we have
$|x_2| < \epsilon$ and $|2x_1x_3 - x_2^2| < \epsilon^2$. The conditions imply also that
$|x_1x_3| < \epsilon^2$. Let $\alpha = \max\{|x_1|, |x_3|\}$. In view of the condition on $\epsilon$ as
above it follows that $\alpha$ is positive and coincides with only one of $|x_1|$ or $|x_3|$. Now
let $t = \sigma\log(\alpha/\epsilon)$, where $\sigma = 1$ if $\alpha = |x_1|$ and $\sigma = -1$ if
$\alpha = |x_3|$. Then $d_tp = e^{-t}x_1e_1 + x_2e_2 + e^{t}x_3e_3$. If $\alpha = |x_1|$ then we have
$|e^{-t}x_1| = \epsilon$ and $|e^{t}x_3| = |x_1x_3|/\epsilon < \epsilon$ and similarly, if
$\alpha = |x_3|$ we get that $|e^{t}x_3| = \epsilon$ and $|e^{-t}x_1| < \epsilon$. Varying $\epsilon$
along a sequence converging to 0 we see from this that there exist a sequence of lattices
$\{\Lambda_j\}$ of the form $d_t\Lambda$ and points $p_j \in \Lambda_j$ such that $p_j \to 0$ as
$j \to \infty$. By Theorem 1.1 this means that $\{d_t\Lambda\}$ is not contained in a compact subset
of $\mathcal{L}^1$, which contradicts $\Lambda$ being from $\mathcal{B}$. The contradiction shows that
the assertion formulated above must be true and hence condition (ii) holds for any
$g \in SL(3, \mathbb{R})$ such that $g\mathbb{Z}^3 \in \mathcal{B}$.
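
The contraction step in this argument is easy to check numerically. The sketch below, with
hypothetical function names and sample data chosen only for illustration, applies
$d_t = \mathrm{diag}(e^{-t}, 1, e^{t})$ with $t = \sigma\log(\alpha/\epsilon)$ to a vector satisfying
$|x_2| < \epsilon$ and $|2x_1x_3 - x_2^2| < \epsilon^2$, and returns the image, all of whose coordinates
should have absolute value at most $\epsilon$ (up to rounding).

```python
import math


def contract(p, eps):
    """Apply d_t = diag(e^-t, 1, e^t), t = sigma*log(alpha/eps), to p = (x1, x2, x3)."""
    x1, x2, x3 = p
    alpha = max(abs(x1), abs(x3))
    sigma = 1.0 if alpha == abs(x1) else -1.0   # which of |x1|, |x3| is the larger one
    t = sigma * math.log(alpha / eps)
    return (math.exp(-t) * x1, x2, math.exp(t) * x3), t


if __name__ == "__main__":
    eps = 0.01
    p = (100.0, 0.001, 5e-7)    # |x2| < eps and |2*x1*x3 - x2^2| < eps^2
    image, t = contract(p, eps)
    print(t, image)
```

The long vector $p$ is thus moved by the flow to a vector of size comparable to $\epsilon$, which
is what forces the orbit $\{d_t\Lambda\}$ out of every compact set as $\epsilon \to 0$.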

To find a $g$ such that condition (i) is also satisfied we proceed as follows: We note that
for any rational quadratic form $R$ the set of $g \in SL(3, \mathbb{R})$ for which the form defined by
$v \mapsto R(g^{-1}v)$ (for all $v \in \mathbb{R}^3$) is a linear combination of the forms $Q_0$ and
$L_0^2$ is an algebraic subvariety of $SL(3, \mathbb{R})$. Since
$\{g \in SL(3, \mathbb{R}) \mid g\mathbb{Z}^3 \in \mathcal{B}\}$ is not contained in a countable union of
algebraic varieties, we get that there exists a $g \in SL(3, \mathbb{R})$ such that
$g\mathbb{Z}^3 \in \mathcal{B}$ and no linear combination of $Q_0$ and $L_0^2$ is of the form
$v \mapsto R(g^{-1}v)$ for any rational quadratic form $R$. The latter condition means that for $Q$
and $L$ defined as in the hypothesis of the theorem, for this $g$, condition (i) is satisfied.
This proves the existence of a $g$ as claimed. The more general assertion in the theorem is
clear from the proof as above.


□


Remark 7.4 In the context of Theorem 6.1 one may look for analogues of Theorems 7.1
and 7.3 for quadratic forms of rank 2. Let $Q = L_1L_2$, where $L_1$ and $L_2$ are two linearly
independent linear forms on $\mathbb{R}^3$ and let
$W = \{v \in \mathbb{R}^3 \mid L_1(v) = L_2(v) = 0\}$, the line of intersection of the planes defined by
$L_1$ and $L_2$. Let $L$ be a linear form such that $L(w) = 0$ for all $w \in W$ (this corresponds to
the tangential intersection condition in Theorem 7.1). Then $L$ is a linear combination of $L_1$
and $L_2$ and hence $|Q(v)|$ and $|L(v)|$ can be small simultaneously only if $|L_1(v)|$ and
$|L_2(v)|$ are small. This shows that for small $a$ and $b$ the inequalities $|Q(x) - a| < \epsilon$
and $|L(x) - b| < \epsilon$ admit integral solutions for all $\epsilon > 0$ only if no linear
combination of $L_1$ and $L_2$ is a rational form. An argument as in the proof of Theorem 6.1
shows that the latter condition is also sufficient to ensure that the inequalities admit
primitive integral solutions, for all $a, b \in \mathbb{R}$ and $\epsilon > 0$. (It may be noted that the
condition is not equivalent to the condition in Theorem 7.1 that no linear combination of
$Q$ and $L^2$ is rational).

On the other hand, corresponding to Theorem 7.3 one can show by an argument analogous
to the one there, using the theorem of Kleinbock and Margulis, that there exist linearly
independent linear forms $L_1$, $L_2$ and $L$ such that for some $\epsilon > 0$ there is no
$x \in \mathbb{Z}^3 \setminus \{0\}$ such that $|L_1(x)L_2(x)| < \epsilon$ and $|L(x)| < \epsilon$, and no
linear combination of $L_1$ and $L_2$ is rational (one can also additionally arrange so that no
linear combination of $L_1L_2$ and $L^2$ is rational).


8 Quantitative Results 


Apart from the problems of existence of solutions of Diophantine inequalities, results on the 
modular homogeneous space and the flows on it can be applied to get quantitative results on 
the number of solutions of such inequalities, involving quadratic forms; (for linear forms the 
corresponding statements follow from results on uniform distribution of sequences of the 
form $\{i(v_1,\dots,v_n)\}$ mod 1, for fixed $v_1,\dots,v_n \in \mathbb{R}$ (see [2]), and will not be considered
here). The classification of invariant measures of flows induced by unipotent one-parameter 
subgroups, due to M. Ratner, plays an important role in this respect. It is applied to obtain 
results on the distribution of the flows which, in turn, together with Theorem 1.2 are applied 
to deduce the asymptotics of the number of integral solutions of the inequalities, in balls 
centered at 0, as the radius tends to infinity. We mention here the following result in this 
regard; the purpose being only to give an idea of the results, we shall not strive for generality 
or completeness (see [12] or [20] for details). 


Theorem 8.1 (Eskin, Margulis and Mozes [12]). Let $Q$ be a nondegenerate indefinite quadratic
form on $\mathbb{R}^n$, $n \ge 5$, which is not a multiple of a rational form. Let $B(r)$ denote the
ball of radius $r$ in $\mathbb{R}^n$ centered at 0. Then there exists a constant $\lambda$ such that for
any open interval $(a, b)$ in $\mathbb{R}$,

$$\frac{1}{r^{n-2}}\,\#\{x \in B(r) \cap \mathbb{Z}^n \mid a < Q(x) < b\} \to \lambda(b - a) \quad \text{as } r \to \infty.$$

The constant $\lambda$ in the theorem also turns out to be such that for any open interval
$(a, b)$ the volumes of $\{v \in B(r) \mid a < Q(v) < b\}$ are asymptotic to $\lambda(b - a)r^{n-2}$, so
the result signifies that the number of solutions of the inequalities in question is asymptotic
to the volumes of the regions they describe in $\mathbb{R}^n$.

It turns out that similar asymptotics do not hold in general for $n = 3$ or 4. In these cases
there are lower estimates comparable to the one above and upper estimates which are higher
by a factor of $\log r$ (see [20] for details).


9 Comments in Conclusion 


The study of flows on SL(n, R)/SL(n, Z), and other more general homogeneous spaces of 
Lie groups, is also applied to various other Diophantine problems. The reader is referred 
to [7] for an exposition of the general theory of flows on homogeneous spaces as well as 
various applications. We mention here only a couple of recent applications of the theory
to certain Diophantine problems. In [13], the authors study the asymptotics of the number 
of integral points on certain subvarieties, within distance r of the origin, as r — oo. In 
particular they obtain the following result, which may be of independent interest. 


Theorem 9.1 (Eskin, Mozes and Shah [13]). Let $P$ be a monic polynomial with integral
coefficients which is irreducible over the rationals and has degree $n \ge 2$. For $r > 0$ let
$N_r$ be the number of integral $n \times n$ matrices $X = (x_{ij})$ with $\sum_{i,j} x_{ij}^2 < r^2$,
whose characteristic polynomial is $P$. Then there exists a constant $c > 0$ such that
$N_r \sim c\,r^{n(n-1)/2}$ as $r \to \infty$.


Interesting results have also been proved recently on ‘approximability properties’ for 
linear forms and vectors in [18] and [16]. In the former, a conjecture of Sprindzuk is 
verified and in the latter the theory of badly approximable and singular systems of linear 
forms is generalised. 
Variants of the Second Borel-Cantelli Lemma
and their Applications in Metric Number Theory 


Glyn Harman 


§1 Introduction 


In Metric Number Theory we are concerned with the arithmetical properties of almost all 
numbers (in the first case with respect to Lebesgue measure on the real line, but the ideas 
generalise to $\mathbb{R}^k$ and other situations). Investigations therefore involve both analysis and
number theory. It is the purpose of this paper to review the important contribution made 
by variants of the second Borel-Cantelli Lemma (also known as the divergence part of 
the Borel-Cantelli Lemma). While doing this we shall prove sharper variants than have 
previously appeared and give their applications. To illustrate the types of result which fall 
within the ambit of our approach we state the following:- 


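Theorem 1 Let $\psi(n)$ be a monotonic decreasing function with $\psi(n) \in (0, \frac{1}{2})$, and
write $P(\alpha, N)$ for the number of solutions to

$$|\alpha p - q| < \psi(p), \quad p, q \text{ prime}, \quad p \le N, \tag{1.1}$$

and put

$$\Psi(N) = 2\sum_{p \le N} \frac{\psi(p)}{\log p}.$$

Then, if $\Psi(\infty)$ diverges, we have, for almost all $\alpha > 0$,

$$\limsup_{N \to \infty} \frac{P(\alpha, N)}{\Psi(N)} \ge 1, \quad \text{and} \quad \liminf_{N \to \infty} \frac{P(\alpha, N)}{\Psi(N)} \le 1. \tag{1.2}$$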

Remarks: In [8] (see also Chapter 6 in [11]), the author showed that if $\Psi(\infty)$ diverges
then $P(\alpha, N) \to \infty$ as $N \to \infty$ for almost all positive $\alpha$. The quantitative results
(1.2) are new. One would expect that, for almost all $\alpha > 0$,

$$\lim_{N \to \infty} \frac{P(\alpha, N)}{\Psi(N)}$$

exists and equals 1, but this result appears to be out of reach at present unless one puts
certain growth conditions on $\Psi(N)$.
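
The quantities in Theorem 1 are easy to compute for a single $\alpha$. The sketch below counts
the solutions of (1.1) and evaluates $\Psi(N)$ for one choice of $\alpha$ and $\psi$; since the
theorem is a statement about almost all $\alpha$, nothing is claimed about any particular value.
The function names, the value $\alpha = \sqrt{2}$ and the function $\psi(n) = 0.4\,n^{-0.1}$ are
assumptions made for illustration only.

```python
import math


def prime_flags(n):
    """Sieve of Eratosthenes: flags[i] is 1 exactly when i is prime, for 0 <= i <= n (n >= 1)."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return flags


def count_and_expected(alpha, psi, N):
    """Return (P(alpha, N), Psi(N)) for the inequality |alpha*p - q| < psi(p), p, q prime, p <= N."""
    flags = prime_flags(int(alpha * N) + 2)
    count, expected = 0, 0.0
    for p in range(2, N + 1):
        if not flags[p]:
            continue
        expected += 2 * psi(p) / math.log(p)
        q = round(alpha * p)        # psi < 1/2, so only the nearest integer can qualify
        if q >= 2 and flags[q] and abs(alpha * p - q) < psi(p):
            count += 1
    return count, expected


if __name__ == "__main__":
    psi = lambda n: 0.4 * n ** -0.1
    for N in (1000, 10000, 100000):
        print(N, count_and_expected(math.sqrt(2), psi, N))
```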
In §2, we shall prove the Borel-Cantelli lemmas and two important variants. In §3 we
consider the zero-one laws which often hold for problems in metric number theory. We 
then apply the results to Khintchine and Duffin-Schaeffer type problems in §4, and to two 
restricted variable problems (like Theorem 1 above) in §5. We then consider a different 
variant of the second Borel-Cantelli lemma in §6 which provides an asymptotic formula for 
the number of solutions to Diophantine inequalities. We end in §7 with open problems and 
other generalizations of the ideas. 


§2 The Borel-Cantelli Lemmas 


Let $(\Omega, \mathcal{M}, \mu)$ be a probability space, and suppose that $A_1, A_2, \dots$ is an
infinite sequence of subsets (events) in $\mathcal{M}$. Write

$$F_N(\alpha, A_n) = |\{n \le N : \alpha \in A_n\}|.$$


Lemma 1 (First Borel-Cantelli Lemma). If

$$\sum_{n=1}^{\infty} \mu(A_n) \tag{2.1}$$

converges, then $F_N(\alpha, A_n)$ is finite for almost all $\alpha \in \Omega$ as $N \to \infty$.


Proof: Clearly the set on which $F_N(\alpha, A_n) \to \infty$ has measure not exceeding

$$\sum_{n=M}^{\infty} \mu(A_n) \tag{2.2}$$

for every $M$. From the convergence of (2.1) we deduce that (2.2) can be made as small as
we please by taking $M$ suitably large. This completes the proof. □


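Lemma 2 (Second Borel-Cantelli Lemma). Suppose that (2.1) diverges, and that

$$\mu(A_j \cap A_k) = \mu(A_j)\mu(A_k) \quad \text{for } j \ne k. \tag{2.3}$$

Then

$$F_N(\alpha, A_n) \to \infty \quad \text{as } N \to \infty \quad \text{for almost all } \alpha \in \Omega. \tag{2.4}$$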


Remarks: The condition (2.3) is often called independence. It is trivial to observe that
the divergence of (2.1) is not sufficient to obtain (2.4). For example, if $\Omega = [0, 1)$,
$A_n = (0, 1/n)$, and $\mu(\cdot)$ denotes Lebesgue measure, then (2.1) diverges, yet (2.4) is not
true for any $\alpha$.
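
The counterexample in the Remarks can be checked directly. The following sketch, with
hypothetical function names, evaluates $F_N(\alpha, A_n)$ and the partial sums of (2.1) for
$A_n = (0, 1/n)$ using exact rational arithmetic: the partial sums grow like $\log N$, while
$F_N(\alpha, A_n)$ stops growing once $n \ge 1/\alpha$.

```python
from fractions import Fraction


def F(alpha, N):
    """F_N(alpha, A_n) = #{n <= N : alpha in A_n} for the events A_n = (0, 1/n)."""
    return sum(1 for n in range(1, N + 1) if 0 < alpha < Fraction(1, n))


def V(N):
    """Partial sum of (2.1): mu(A_1) + ... + mu(A_N) = 1 + 1/2 + ... + 1/N."""
    return sum(Fraction(1, n) for n in range(1, N + 1))


if __name__ == "__main__":
    alpha = Fraction(3, 10)
    for N in (10, 100, 1000):
        print(N, F(alpha, N), float(V(N)))
```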


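Proof: Let $\chi_j(\alpha)$ be the characteristic function of $A_j$, and write

$$f_N(\alpha) = \frac{\sum_{j=1}^{N} (\chi_j(\alpha) - \mu(A_j))}{\sum_{j=1}^{N} \mu(A_j)}.$$

Since
$$\int_\Omega \chi_j(\alpha)\,d\mu = \mu(A_j),$$
and, for $j \ne k$,
$$\int_\Omega \chi_j(\alpha)\chi_k(\alpha)\,d\mu = \mu(A_j \cap A_k) = \mu(A_j)\mu(A_k),$$
we have

$$\int_\Omega f_N(\alpha)^2\,d\mu = \frac{\sum_{j=1}^{N} (\mu(A_j) - \mu(A_j)^2)}{\left(\sum_{j=1}^{N} \mu(A_j)\right)^2} \le \left(\sum_{j=1}^{N} \mu(A_j)\right)^{-1}.$$

Since (2.1) diverges this gives
$$\lim_{N \to \infty} \int_\Omega f_N(\alpha)^2\,d\mu = 0.$$
Hence, by Fatou's lemma,

$$\liminf_{N \to \infty} f_N(\alpha)^2 = 0 \quad \text{for almost all } \alpha.$$

Thus
$$\limsup_{N \to \infty} \frac{\sum_{j=1}^{N} \chi_j(\alpha)}{\sum_{j=1}^{N} \mu(A_j)} \ge 1 \quad \text{for almost all } \alpha,$$
from which (2.4) follows. □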


The second Borel-Cantelli lemma is of limited use in metric number theory since (2.3) 
only holds for certain types of problems. The following variant, which is a simple consequence
of Cauchy's inequality, has been much more useful, however. The basic idea goes
back to a lemma of Paley and Zygmund!®. 


Lemma 3 Suppose that (2.1) diverges. Then the set $\mathcal{D}$ on which (2.4) holds satisfies
$\mu(\mathcal{D}) \ge c$, where

$$c = \limsup_{N \to \infty} \left(\sum_{j=1}^{N} \mu(A_j)\right)^2 \left(\sum_{j,k=1}^{N} \mu(A_j \cap A_k)\right)^{-1}. \tag{2.5}$$
Proof: Write

$$\mathcal{A}_m^n = \bigcup_{j=m}^{n} A_j, \quad M(m, n) = \sum_{j=m}^{n} \mu(A_j), \quad V(m, n) = \sum_{j,k=m}^{n} \mu(A_j \cap A_k).$$

We take $\chi_j(\alpha)$ as in Lemma 2, and let $\chi_{m,n}(\alpha)$ be the characteristic function of
$\mathcal{A}_m^n$. To prove the lemma it suffices to show that

$$\lim_{m \to \infty} \left(\lim_{n \to \infty} \mu(\mathcal{A}_m^n)\right) \ge \limsup_{N \to \infty} \frac{M(0, N)^2}{V(0, N)}. \tag{2.6}$$

Since (2.1) diverges, we have
$$M(m, n) = M(0, n) + O(m), \quad V(m, n) = V(0, n) + O(m).$$
Hence, since $\mu(\mathcal{A}_m^n)$ is a non-decreasing function of $n$, we need only show that

$$\mu(\mathcal{A}_m^n) \ge \frac{M(m, n)^2}{V(m, n)}. \tag{2.7}$$

This is a simple consequence of the Cauchy-Schwarz inequality as we now show. We have

$$M(m, n) = \int_\Omega \sum_{j=m}^{n} \chi_j(\alpha)\,d\mu = \int_\Omega \chi_{m,n}(\alpha) \sum_{j=m}^{n} \chi_j(\alpha)\,d\mu.$$

Thus, by the Cauchy-Schwarz inequality,

$$M(m, n)^2 \le \left(\int_\Omega \chi_{m,n}(\alpha)^2\,d\mu\right) \left(\int_\Omega \sum_{j,k=m}^{n} \chi_j(\alpha)\chi_k(\alpha)\,d\mu\right) = \mu(\mathcal{A}_m^n)V(m, n).$$

This gives (2.7) and completes the proof. □


The above lemma usually gives the stronger conclusion that $\mu(\mathcal{D}) = 1$, since the sets
dealt with frequently satisfy a zero-one law, that is $\mu(\mathcal{D})$ can only be 0 or 1. Hence, we
only require $c > 0$ in order to establish that $\mu(\mathcal{D}) = 1$. Even when such a law is not known
a priori it may still be possible to reach the same conclusion as we show in §4. 

Sullivan’* appears to have been the first to state that the hypotheses of Lemma 3 lead 
to a quantitative conclusion, although, with different notation, we note that the result is 
essentially due to Cassels (Lemma 1 in [3]), following Paley and Zygmund!®. We now give 
a strengthened version of Sullivan’s result. 


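Theorem 2 Given the hypotheses of Lemma 3, write, for $\epsilon > 0$,

$$\mathcal{E} = \left\{\alpha : \limsup_{N \to \infty} \frac{F_N(\alpha, A_n)}{V(N)} \ge 1 - \epsilon\right\},$$

$$\mathcal{J} = \left\{\alpha : \liminf_{N \to \infty} \frac{F_N(\alpha, A_n)}{V(N)} \le 1 + \epsilon\right\},$$

where

$$V(N) = \sum_{j=1}^{N} \mu(A_j).$$

Then

$$\mu(\mathcal{E}) \ge c\epsilon^2, \quad \mu(\mathcal{J}) \ge \frac{\epsilon}{1 + \epsilon}. \tag{2.8}$$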
Remark: Note that the second statement of (2.8) is independent of c. 
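Proof: Let $\mathcal{F}_N$ be the set on which
$$F_N(\alpha, A_n) \ge (1 - \epsilon)V(N),$$

and write $\mathcal{G}_N$ for its complement in $\Omega$. We then have

$$V(N) = \int_\Omega \sum_{n=1}^{N} \chi_n(\alpha)\,d\mu = \int_{\mathcal{G}_N} \sum_{n=1}^{N} \chi_n(\alpha)\,d\mu + \int_{\mathcal{F}_N} \sum_{n=1}^{N} \chi_n(\alpha)\,d\mu.$$

Thus

$$\int_{\mathcal{F}_N} \sum_{n=1}^{N} \chi_n(\alpha)\,d\mu = V(N) - \int_{\mathcal{G}_N} \sum_{n=1}^{N} \chi_n(\alpha)\,d\mu \ge V(N) - \mu(\mathcal{G}_N)(1 - \epsilon)V(N) \ge \epsilon V(N).$$

So, by Cauchy's inequality,
$$\mu(\mathcal{F}_N) \sum_{1 \le j,k \le N} \mu(A_j \cap A_k) = \mu(\mathcal{F}_N) \int_\Omega \sum_{1 \le j,k \le N} \chi_j(\alpha)\chi_k(\alpha)\,d\mu \ge \epsilon^2 V^2(N).$$

Hence, by (2.5), there is a sequence $\eta_N \to 0$ such that
$$\mu(\mathcal{F}_N) \ge c\epsilon^2 - \eta_N,$$
for infinitely many $N$, say $N_1, N_2, \dots$. Write $\mathcal{F}_j = \mathcal{F}_{N_j}$. Then
$$\sum_{1 \le j,k \le N} \mu(\mathcal{F}_j \cap \mathcal{F}_k) \le N \sum_{1 \le n \le N} \mu(\mathcal{F}_n), \quad \sum_{1 \le n \le N} \mu(\mathcal{F}_n) \ge c\epsilon^2 N + o(N),$$

and so

$$\limsup_{N \to \infty} \left(\sum_{n=1}^{N} \mu(\mathcal{F}_n)\right)^2 \left(\sum_{1 \le j,k \le N} \mu(\mathcal{F}_j \cap \mathcal{F}_k)\right)^{-1} \ge c\epsilon^2.$$

It follows from Lemma 3 that

$$\mu(\{\alpha : \alpha \in \mathcal{F}_j \text{ for infinitely many } j\}) \ge c\epsilon^2,$$

which establishes the first statement in (2.8).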
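Now let $\mathcal{H}_N$ be the set on which

$$F_N(\alpha, A_n) \le (1 + \epsilon)V(N),$$
and write $\mathcal{I}_N$ for its complement in $\Omega$. Then

$$(1 + \epsilon)V(N)\mu(\mathcal{I}_N) \le \int_{\mathcal{I}_N} F_N(\alpha, A_n)\,d\mu \le \int_\Omega F_N(\alpha, A_n)\,d\mu = V(N).$$

Hence $\mu(\mathcal{H}_N) = 1 - \mu(\mathcal{I}_N) \ge 1 - \frac{1}{1 + \epsilon} = \frac{\epsilon}{1 + \epsilon}$. It is now
straightforward, as above, to deduce that the set of $\alpha$ belonging to infinitely many
$\mathcal{H}_N$ also has measure at least $\frac{\epsilon}{1 + \epsilon}$.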


One might wonder whether a stronger conclusion than (2.8) were obtainable with the
given hypotheses, say with the lim inf in $\mathcal{J}$ bounded below by a constant. That this is not
the case is shown clearly by the following example. Take $\Omega = \{0, 1\}$,
$\mu(\{0\}) = \mu(\{1\}) = \frac{1}{2}$,

$$A_n = \begin{cases} \{1\} & \text{if } 2^{j^2} \le n < 2^{(j+1)^2} \text{ with } j \text{ even}, \\ \{0\} & \text{if } 2^{j^2} \le n < 2^{(j+1)^2} \text{ with } j \text{ odd}. \end{cases}$$


Then 
N : N 2 
l 
lim su A A; NA > = 
eee S- u iD a L(A; k) =4 
j=l j.k=1 
for all $N$, $\mu(\mathcal{E}) = \mu(\mathcal{J}) = 1$, yet for all $\alpha$
$$\liminf_{N \to \infty} \frac{F_N(\alpha, A_n)}{V(N)} = 0. \tag{2.9}$$
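
The two-point example can be tabulated exactly. The sketch below assumes that the blocks are
$2^{j^2} \le n < 2^{(j+1)^2}$, with $A_n = \{1\}$ on even blocks and $A_n = \{0\}$ on odd blocks; this
reading of the block boundaries, and the function names, are assumptions. It returns the
quotient $F_N(\alpha, A_n)/V(N)$ together with the quotient appearing in (2.5).

```python
from fractions import Fraction


def block_index(n):
    """Largest j >= 0 with 2**(j*j) <= n."""
    j = 0
    while 2 ** ((j + 1) ** 2) <= n:
        j += 1
    return j


def event(n):
    """The single point of A_n: 1 on even blocks, 0 on odd blocks."""
    return 1 if block_index(n) % 2 == 0 else 0


def statistics(N, alpha):
    """Return (F_N(alpha)/V(N), quotient in (2.5)) for alpha in {0, 1}."""
    ones = sum(1 for n in range(1, N + 1) if event(n) == 1)
    zeros = N - ones
    V = Fraction(N, 2)                                     # each A_n has measure 1/2
    F = ones if alpha == 1 else zeros
    pair_sum = Fraction(ones * ones + zeros * zeros, 2)    # sum of mu(A_j ∩ A_k)
    return F / V, V * V / pair_sum


if __name__ == "__main__":
    for N in (15, 511, 65535):                             # ends of blocks j = 1, 2, 3
        print(N, [float(v) for v in statistics(N, 1)], [float(v) for v in statistics(N, 0)])
```

At the end of each odd block the quotient for $\alpha = 1$ is small, and at the end of each even
block the quotient for $\alpha = 0$ is small, while the quotient from (2.5) never drops below
$\frac{1}{2}$; this is the mechanism behind (2.9).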


There is no benefit in imposing the condition $\mu(A_n) \to 0$. We could take $\Omega = [0, 1)$,
$\mu(\cdot)$ as Lebesgue measure, and


r l or 1 
Avetin U (Fa 5+ aa): 


where 
_ | (0, 5) if 2° < V(n) < 20+)? with j even 
"Lelia if 27 < van) < 20+)? with j odd 
2 


Then, writing $\varphi(n)$ for Euler's function, we have


p(n) Gln) 6 
(An) =>, Da ~ —ylogN, 


n=] 


(2.5) holds with $c > 0$, $\mu(\mathcal{E}) = \mu(\mathcal{J}) = 1$, but (2.9) also holds for almost all
$\alpha$. Of course, it is the intersection with $\mathcal{I}_n$ which makes the problem unnatural in
Number Theory. If we removed that we would get an asymptotic formula for $F_N(\alpha, A_n)$ for
almost all $\alpha$ (see Theorem 4.4 in [11]).
§3 Zero-One Laws


We now consider how the conclusion of positive measure can be converted to a result of full 
measure. One approach is to define a suitable mapping T such that DT = D, where D is 
the set on which our required property holds. If T is ergodic, that is the only invariant sets 
have measure 0 or 1, then we have achieved our goal (see Section 2.2 of [11] for example). 
Another approach is to apply the results of the previous section to short intervals after a 
suitable scaling. In this way we obtain in place of (2.8): 


w(E) > ce?, wIJ)= eh: (3.1) 


Note that the first statement remains unchanged, but one should think of c now as being of 
size c’ (92). If we are dealing with Lebesgue measure then the following consequence of 


the Lebesgue density theorem is crucial. Throughout this section $\mu(\cdot)$ denotes Lebesgue
measure on $\mathbb{R}$.


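Lemma 4 Let $\mathcal{K}$ be a given subset of $\mathbb{R}$ such that
$$\mu(\mathcal{K} \cap \mathcal{I}) \ge \delta\mu(\mathcal{I}) \tag{3.2}$$

for every finite subinterval $\mathcal{I}$ of $\mathbb{R}$, where $\delta$ is a positive constant. Then almost
all real $\alpha$ belong to $\mathcal{K}$.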


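Proof: Write $\mathcal{C} = \mathbb{R} \setminus \mathcal{K}$ and suppose that $\mu(\mathcal{C}) > 0$. Then, by the
Lebesgue density theorem, $\mathcal{C}$ has a point of metric density, say $a$, that is to say

$$\mu((a - \epsilon, a + \epsilon) \cap \mathcal{C}) \ge 2\epsilon(1 - \delta/2) \tag{3.3}$$
for all sufficiently small $\epsilon$. Let $\mathcal{I} = (a - \epsilon, a + \epsilon)$. Then, by (3.2) and (3.3)
we have

$$\mu(\mathcal{I}) = \mu(\mathcal{I} \cap \mathcal{K}) + \mu(\mathcal{I} \cap \mathcal{C}) \ge \delta\mu(\mathcal{I}) + \mu(\mathcal{I})\left(1 - \frac{\delta}{2}\right) = \left(1 + \frac{\delta}{2}\right)\mu(\mathcal{I}).$$

This contradiction establishes $\mu(\mathcal{C}) = 0$ as required. □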


In the above we could have replaced $\mathbb{R}$ by any subinterval $\mathcal{J}$ of $\mathbb{R}$, and the
conclusion would have been that almost all $\alpha \in \mathcal{J}$ belong to $\mathcal{K}$. Combining
Lemma 4 with Theorem 2, having replaced (2.8) by (3.1), then gives the following result.


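Theorem 3 Let $\mathcal{J}$ be a subinterval of $\mathbb{R}$, and suppose that we have, for any finite
subinterval $\mathcal{I} \subset \mathcal{J}$,

$$\lim_{N \to \infty} \frac{1}{V(N)} \sum_{n=1}^{N} \mu(\mathcal{I} \cap A_n) = \mu(\mathcal{I}) \tag{3.4}$$

with $V(N)$ as in Theorem 2 and $V(\infty) = \infty$, and

$$\limsup_{N \to \infty} \left(\sum_{n=1}^{N} \mu(A_n \cap \mathcal{I})\right)^2 \left(\sum_{1 \le j,k \le N} \mu(A_j \cap A_k \cap \mathcal{I})\right)^{-1} \ge c\mu(\mathcal{I}) > 0. \tag{3.5}$$

Then, for almost all $\alpha \in \mathcal{J}$ we have

$$\limsup_{N \to \infty} \frac{F_N(\alpha, A_n)}{V(N)} \ge 1, \quad \text{and} \quad \liminf_{N \to \infty} \frac{F_N(\alpha, A_n)}{V(N)} \le 1. \tag{3.6}$$

In particular, given $\epsilon > 0$, for almost all $\alpha \in \mathcal{J}$ there are infinitely many $N$ with
$$1 - \epsilon < \frac{F_N(\alpha, A_n)}{V(N)} < 1 + \epsilon. \tag{3.7}$$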

The example given at the end of §2 can be modified to show that the hypotheses of
Theorem 3 do not imply

$$\liminf_{N \to \infty} \frac{F_N(\alpha, A_n)}{V(N)} > 0. \tag{3.8}$$
In this case we take 


. B,; if2@)° < vn) <2@/+" 
"Lc; if 22D" < van) < 22542", 


where 


k k+1 k k+1 
B= Ula ar) &= U [a ar): 
k odd k even 


Again, the intersection with $\mathcal{I}_n$ is unnatural in number-theoretic problems, but this
example shows that some other information is needed to obtain (3.8).


§4 Application to Khintchine and Duffin-Schaeffer Type Problems 


In 1924, Khintchine!® proved the following result:- 
Let $\psi(n)$ be a function such that $n\psi(n)$ is decreasing and

$$\sum_{n=1}^{\infty} \psi(n) = \infty. \tag{4.1}$$

Then, for almost all $\alpha$, there exist infinitely many rationals $a/q$ which satisfy
$$\left|\alpha - \frac{a}{q}\right| < \frac{\psi(q)}{q}. \tag{4.2}$$


In this situation we can take $\Omega = [0, 1)$, and our first thought would be to put


n—| 
_ a_ yw) a, wl) 
a =U(; n ao n i 


In fact we run into serious trouble with this choice for $A_n$, and we need to impose the extra
condition $(a, n) = 1$ (or else our overlap estimates become too large). This gives

$$\sum_{n=1}^{N} \mu(A_n) = 2\sum_{n=1}^{N} \frac{\psi(n)\varphi(n)}{n}. \tag{4.3}$$
Now, if $n\psi(n)$ is decreasing, then the divergence of (4.1) implies the divergence of (4.3) by partial summation. The crucial feature is the consideration of the overlaps $A_j\cap A_k$. To be precise, we must investigate

$$\sum_{1\le j,k\le N}\mu(A_j\cap A_k) = 2\sum_{1\le j<k\le N}\mu(A_j\cap A_k) + \sum_{n=1}^{N}\mu(A_n). \qquad (4.4)$$


The second term on the right of (4.4) is of smaller order than the other terms in view of the divergence of (4.1). Also

$$\sum_{1\le j<k\le N}\mu(A_j\cap A_k) \le 2\sum_{1\le j<k\le N}\frac{\psi(k)}{k}\ {\sum}^{*}\,1,$$

where $*$ indicates the summation conditions:

$$|aj-bk| < 2k\psi(j),\quad (j,b)=(a,k)=1,\quad 1\le a\le k,\quad 1\le b\le j.$$


Since $aj-bk=0$ has no solution with the additional constraints, and $aj-bk=h$ has no


solution if GH y has any factor in common with j or k, and (j, k) solutions in a otherwise, 
we obtain

$$\sum_{1\le j<k\le N}\mu(A_j\cap A_k) \le 4\sum_{1\le j<k\le N}\frac{\psi(k)}{k}\,2k\psi(j) \le 4\Big(\sum_{n=1}^{N}\psi(n)\Big)^{2} \le 16\Big(\sum_{n=1}^{N}\frac{\psi(n)\phi(n)}{n}\Big)^{2}$$


for all large N. Since it is not difficult to prove a zero-one law for this situation, this proves 
Khintchine’s result in view of Lemma 3. 
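The partial summation step compares $\sum\psi(n)\phi(n)/n$ with $\sum\psi(n)$. The sketch below computes both sums with a totient sieve; the helper names `totients` and `weighted_ratio` and the sample $\psi(n)=1/n$ are assumptions chosen for illustration, not taken from the source. Since $\phi(n)/n$ has mean value $6/\pi^2\approx 0.61$, one would expect the ratio to stay above $1/2$ for such $\psi$, which is what the final inequality of the proof uses.

```python
def totients(N):
    """Euler's phi(n) for 0 <= n <= N, computed with a sieve."""
    phi = list(range(N + 1))
    for p in range(2, N + 1):
        if phi[p] == p:  # phi[p] is still untouched, so p is prime
            for m in range(p, N + 1, p):
                phi[m] -= phi[m] // p
    return phi

def weighted_ratio(psi, N):
    """Ratio of sum psi(n)*phi(n)/n to sum psi(n) over 1 <= n <= N."""
    phi = totients(N)
    plain = sum(psi(n) for n in range(1, N + 1))
    weighted = sum(psi(n) * phi[n] / n for n in range(1, N + 1))
    return weighted / plain

if __name__ == "__main__":
    for N in (10**2, 10**3, 10**4):
        print(N, weighted_ratio(lambda n: 1.0 / n, N))
```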

The above proof is essentially due to Duffin and Schaeffer, who observed that the hypothesis '$n\psi(n)$ decreasing' could be replaced with

$$\limsup_{N\to\infty}\Big(\sum_{n=1}^{N}\frac{\psi(n)\phi(n)}{n}\Big)\Big(\sum_{n=1}^{N}\psi(n)\Big)^{-1} > 0,$$

and the conclusion (4.2) holds with $(a,q)=1$. It is, in fact, the repeat solutions $b/r = a/q$ that lead to problems in estimating the overlaps, and these are excluded by imposing $(a,n)=1$ in $A_n$. Duffin and Schaeffer made the following conjecture which is an important unsolved problem in this field.


Conjecture: Let $\psi(n)\in[0,1)$ be any function such that

$$\sum_{n=1}^{\infty}\frac{\psi(n)\phi(n)}{n} = \infty.$$

Then there are infinitely many solutions to

$$|\alpha q-a| < \psi(q),\quad (a,q)=1$$

for almost all real $\alpha$.
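The coprimality condition in the conjecture changes the count of solutions noticeably. As a rough illustration only, here is a sketch that counts the denominators $q\le N$ giving a reduced solution; the name `count_coprime_solutions` is hypothetical, and the code assumes $\psi(q)<1/2$ so that the nearest integer to $\alpha q$ is the only candidate for $a$.

```python
from math import gcd

def count_coprime_solutions(alpha, psi, N):
    """Count q <= N with |alpha*q - a| < psi(q) and gcd(a, q) = 1.

    Assumes psi(q) < 1/2, so a must be the nearest integer to alpha*q.
    """
    count = 0
    for q in range(1, N + 1):
        a = round(alpha * q)
        if abs(alpha * q - a) < psi(q) and gcd(a, q) == 1:
            count += 1
    return count
```

For $\alpha = 1/2$ every even $q$ solves the inequality, but only $q=2$ gives a fraction in lowest terms. These are exactly the repeat solutions that the condition $(a,q)=1$ removes.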
See Chapters 2 and 3 of [11] for the known cases of this conjecture.
Now let us suppose that we want to establish, for almost all real $\alpha$, a result of the form

$$\limsup_{N\to\infty}\frac{S(\alpha,N)}{\Psi(N)} \ge C \qquad (4.5)$$

where

$$\Psi(N) = 2\sum_{n=1}^{N}\psi(n),$$

and $S(\alpha,N)$ denotes the number of solutions to (4.2) with $q\le N$. If we assume that $\psi(n)$ is monotonic decreasing, or that

$$c_1 < \frac{\psi(m)}{\psi(n)} < c_2\quad\text{for all } m,n \text{ with } n\le m\le 2n, \qquad (4.6)$$

for positive constants $c_1, c_2$, then it is possible to use Theorem 3 as we now show.
First we replace $(a,n)=1$ in the definition of $A_n$ by $(a,n)\le D$ (this idea is due to W.M. Schmidt [22]). This gives, if $\mathcal{I} = (\beta,\beta+\delta)$,


aC n) 


Sit NI) =6 Type) (1+ 0(1)) as N —> 00, 
n=1 n=1 
where 
oDn= D1. 
d<N 
(d.n)<D 
Also, 
LAC w(M) 
Ss L(Am 1 An MD) s Ki min (AEP a ae ye l, (4.7) 
l<m<n<n a,b,m,n 
where M and R satisfy 
M=(1+6), R=(1+6)', M<RK<N, 
and a,b, m,n are restricted by 
$$|am-bn| \le K_2\max(\psi(R)M,\ \psi(M)R) = H,\ \text{say}, \qquad (4.8)$$

and

$$(m,b)\le D,\quad (n,a)\le D,\quad m : M,\quad n : R,\quad b :: \beta M,\quad a :: \beta R,$$

where $x : X$ indicates $X\le x<(1+\delta)X$, while $x :: X$ stands for $X\le x<(1+\delta)^{2}X$.
Now we can replace (4.8) by

$$am'-b'n = h,\quad 1\le|h|\le H', \qquad (4.9)$$

where

$$m' = \frac{m}{(m,b)},\quad b' = \frac{b}{(m,b)},\quad H' = \frac{H}{(m,b)},$$


since the terms with |am — bn| = 0 contribute to (4.7) 


N 
<D? > wn). 


n=1\ 


For each $m$, $b$ and $h$ there is exactly one solution to (4.9) in $n$ per complete set of residues (mod $m'$).
Hence the number of solutions all told for each pair $m$, $b$ is no more than


H (1 * ae) 
(m, b) m 


Summing over m and b thus gives a bound 
5R 
< 36°M7h (1 + a) 
M 


Substituting this into (4.7) gives a bound 


K em (“. ve) (RM + 62M?) max YRYM, W(M)R) 


« SS RMY(R)W(M) + Dav) S* my(m) 


M,.R m<R 
N 2 oN 
« (do 7) +) vn) 
n=] n=] 


provided that $\psi(n)\ll n^{-1}$. Indeed, we could also have obtained a satisfactory bound if

$$g(n)(\log n)^{A-3} \ll n\psi(n) \ll (\log n)^{A}$$

for some $A>0$ and some function $g(n)\to\infty$ as $n\to\infty$. Under these conditions we obtain, for almost all $\alpha$, by Theorem 3 that

$$\limsup_{N\to\infty}\frac{S(\alpha,D,N)}{\Psi(D,N)} \ge 1,$$

with

$$\Psi(D,N) = 2\sum_{n=1}^{N}\frac{\phi(D,n)}{n}\psi(n),$$

and where $S(\alpha,D,N)$ is the number of solutions to

$$|\alpha n-m| < \psi(n),\quad n\le N,\quad (m,n)\le D.$$
Since

$$\lim_{D\to\infty}\lim_{N\to\infty}\frac{\Psi(D,N)}{\Psi(N)} = 1$$

for many sequences $\psi(n)$, in particular if $\psi(n)$ is monotonic decreasing, this gives, in many cases, the expected lower bound for almost all $\alpha$:

$$\limsup_{N\to\infty}\frac{S(\alpha,N)}{\Psi(N)} \ge 1.$$

In fact, it is possible to take $D$ increasing with $n$ in the above, say $D = \Psi(n)^{4}$.

We now consider the weak point in the above proof. We required a bound on $\psi(n)$ to deal with a sum which arose because $n$ might not run over a complete set of residues (mod $m$) (the problem arises from the need to keep our results uniform in $\delta$). We did not make use of all the variables in (4.8)/(4.9). We can reduce the number of variables to 3 by recasting (4.8) as

—hn 7 


Here we have assumed $(n,m)=1$, written $\bar n$ for the inverse of $n$ (mod $m$), denoted fractional part by $\{\cdot\}$, and the conditions on the variables are

$$1\le|h|\le H,\quad m : M,\quad n : R.$$

The requirement $(m,n)=1$ is overcome by considering all $d\mid m$ and restricting attention to all $n$ with $(n,m)=d$. After summing over $d$ this deals with all $n$. To bound the number of solutions to (4.10) first fix $m$, then by standard techniques (see Lemma 5.1 in [11]), writing $e(x) = \exp(2\pi i x)$, the number of solutions is


PI 


h=1n:R 


< 38°RH + 65 Ds 


R 
< 36°RH +H (mie a a) 
ld<1 


using the well-known bound for an incomplete Kloosterman sum (see Lemma 3.6 in [14] for example). Summing over $m$ then gives a bound


3697RHM + M2t*©H + RH. 


Since 


y\ MI*eY(R)Y(M) = 0(87(N)) and S> RY(R)W(M) = 0(W(N)), 
M.R 


M.R 


this establishes a suitable bound for the overlap estimates without any restriction on the 
size of w(n) except (4.6), or monotonicity. In this way we obtain more precise results than 
Sullivan (Theorem 4 in [24]) with weaker hypotheses. For example we can easily modify 
the above to obtain the following quantitative form of the Duffin and Schaeffer result. 
Theorem 4 Suppose that $\psi(n)\in(0,\tfrac12)$, and is either monotonic or satisfies (4.6). Also suppose that $\Psi'(\infty)=\infty$ where

$$\Psi'(N) = 2\sum_{n=1}^{N}\frac{\phi(n)}{n}\psi(n).$$



Then, writing (a, N) for the number of solutions to 
$$|\alpha n-m| < \psi(n),\quad (m,n)=1,\quad n\le N,$$


we obtain 
ua, N) 
lim sup ———— > 
N—-co W'(N) 


for almost all a. 


§5 Application to Two Restricted Variable Problems 


In [6-9] (see also Chapter 6 of [11]) the author has considered the problem of showing whether infinitely many solutions exist to

$$|\alpha n-m| < \psi(n),\quad n\in\mathcal{A},\ m\in\mathcal{B},$$

with $\mathcal{A}$ and $\mathcal{B}$ given sets of integers. One expects infinitely many solutions for almost all $\alpha$ if the sum

$$\sum_{n\in\mathcal{A}}\rho(n)\psi(n)$$

diverges, where $\rho(n)$ is a 'smooth' function representing the probability that $n\in\mathcal{B}$ (for example, if $\mathcal{B}$ is the set of primes then $\rho(n) = (\log n)^{-1}$). That convergence of the sum gives only finitely many solutions for almost all $\alpha$ is easy to establish from the first Borel-Cantelli lemma. Many of the divergence cases were settled by combining Lemmas 3 and 4 above. However, the arithmetical information obtained in some of the cases discussed is strong enough to apply Theorem 3 instead. One example is given above as Theorem 1.
The most delicate part of the proof, as expected, is to obtain the correct order of size for the overlap estimates $\mu(A_p\cap A_q\cap\mathcal{I})$ with $p$, $q$ primes and

$$A_p = \bigcup_{r}\Big(\frac{r-\psi(p)}{p},\ \frac{r+\psi(p)}{p}\Big),$$

with $r$ also denoting a prime. Here Kloosterman sums together with a three-dimensional sieve were needed to bound the number of solutions to the resulting inequalities:


|lpr —qs| <A, p,q,r,s primes. 


In the same way, several of the results in [9] also can now be restated to give a conclusion 
like (1.2). 

As another application of the general technique we mention the following result which 
can be obtained by combining the present Theorem 3 with the overlap estimates in [10]. 
Theorem 5 Let $a_n$ be an increasing sequence of reals with $a_{n+1}\ge a_n+1$, $a_n\ge 2$. Write $P(\alpha,N)$ for the number of primes of the form $[a_n\alpha]$, $n\le N$, and put

$$V(N) = \sum_{n=1}^{N}\frac{1}{\log a_n}.$$

Then, if $V(\infty)$ diverges, for almost all $\alpha$ we have

$$\limsup_{N\to\infty}\frac{P(\alpha,N)}{V(N)}\ge 1,\qquad \liminf_{N\to\infty}\frac{P(\alpha,N)}{V(N)}\le 1.$$


With some tedious, but straightforward, modifications the method of [12] can be 


combined with Theorem 3 to produce the following:- 


Theorem 6 Letk > 2. Write 1; (a, N) for the number of k-tuplets of primes (pi, ..., Pk) 
such that 


pj =(p§4)], k>j 22. 
Then, for almost alla > 2, we have 


m*(a, N)log* N m*(a, N) log N 
lim sup SOS AS > ed ae lim inf EAB) Sb < goa 
N->00 N N->0o N 


§6 Asymptotic Formula Variants
In the applications given above it would be expected that instead of inequalities for lim sup and lim inf we should be able to get a result

$$\lim_{N\to\infty}\frac{H(\alpha,N)}{V(N)} = 1,$$

or even an explicit error term:

$$H(\alpha,N) = V(N) + O(V'(N))\quad\text{as } N\to\infty.$$

Here $H(\alpha,N)$ counts the number of solutions to an inequality, $V(N)$ is the expected number of solutions, and $V'(N) = o(V(N))$. Weyl [26] presented a technique which has been refined by subsequent authors to give the following.


Theorem 7 Let $\Omega$ be a measure space with measure $\mu$ such that $0<\mu(\Omega)<\infty$. Let $f_k(\alpha)$ $(k=1,2,\ldots)$ be a sequence of non-negative $\mu$-measurable functions, and let $f_k$, $\varphi_k$ be sequences of real numbers such that

$$0\le f_k\le\varphi_k\quad (k=1,2,\ldots).$$

Write

$$\Phi(N) = \sum_{k=1}^{N}\varphi_k,$$

and suppose that $\Phi(N)\to\infty$ as $N\to\infty$. Suppose that for arbitrary integers $m$, $n$ $(1\le m\le n)$ we have

$$\int_{\Omega}\Big(\sum_{m\le k\le n}(f_k(\alpha)-f_k)\Big)^{2}d\mu \le K\sum_{m\le k\le n}\varphi_k \qquad (6.1)$$

for an absolute constant $K$. Then, given $\epsilon>0$, for almost all $\alpha\in\Omega$ we have, as $N\to\infty$,

$$\sum_{k=1}^{N}f_k(\alpha) = \sum_{k=1}^{N}f_k + O\Big(\Phi^{1/2}(N)(\log(\Phi(N)+2))^{3/2+\epsilon} + \max_{1\le k\le N}f_k\Big). \qquad (6.2)$$


k=1 


Proof: For a detailed proof see Lemma 1.5 in [11]. We sketch the basic ideas here. First write

$$n_j = \max\{n : \Phi(n) < j\}.$$

It then suffices to prove (6.2) for $N = n_j$. Next express $j$ in binary as

$$j = \sum_{v}b(j,v)2^{v},$$

and write

$$B(j) = \Big\{(i,s) : i = \sum_{v>s}b(j,v)2^{v-s},\ b(j,s)=1,\ 0\le s\le r\Big\}$$

where $r = r(j) = [\log_2 j]$, and put

$$F(i,s,\alpha) = \sum_{u_0<k\le u_1}(f_k(\alpha)-f_k),$$

with

$$u_t = u_t(i,s) = \max\{n\ge 0 : \Phi(n) < (i+t)2^{s}\},$$

with the convention $\max\emptyset = 0$. Thus

$$\sum_{k=1}^{n_j}(f_k(\alpha)-f_k) = \sum_{(i,s)\in B(j)}F(i,s,\alpha).$$

It thus suffices to show that, for almost all $\alpha$,

$$\sum_{(i,s)\in B(j)}|F(i,s,\alpha)| = O\big(j^{1/2}(\log(j+2))^{3/2+\epsilon}\big).$$

We have

$$\sum_{(i,s)\in B(j)}|F(i,s,\alpha)| \le |B(j)|^{1/2}G^{1/2}(r,\alpha) \ll r^{1/2}G^{1/2}(r,\alpha)$$

with

$$G(r,\alpha) = \sum_{0\le s\le r}\ \sum_{0\le i<2^{r+1-s}}F^{2}(i,s,\alpha).$$

The required bound for $G(r,\alpha)$ for almost all $\alpha$, namely $G(r,\alpha) < r^{2+\epsilon}2^{r}$ for $r>r(\alpha)$, may then be deduced from (6.1) with the first Borel-Cantelli lemma since

$$\sum_{r}\mu(\{\alpha\in\Omega : G(r,\alpha) > r^{2+\epsilon}2^{r}\})$$

converges. This completes the proof. □
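The binary decomposition in the proof splits the range $[0,j)$ into at most $r+1$ dyadic blocks $[i2^{s},(i+1)2^{s})$, one for each binary digit of $j$ equal to 1. The sketch below illustrates this under one reading of the index set $B(j)$, namely $i=\sum_{v>s}b(j,v)2^{v-s}$ with $b(j,s)=1$; that reading and the function names are assumptions, since the display is damaged in the source.

```python
def dyadic_blocks(j):
    """Pairs (i, s) with binary digit b(j, s) = 1 and i = sum_{v>s} b(j, v) * 2**(v - s)."""
    blocks = []
    s = 0
    while (j >> s) > 0:
        if (j >> s) & 1:
            i = (j >> (s + 1)) << 1  # digits of j above position s, in units of 2**s
            blocks.append((i, s))
        s += 1
    return blocks

def block_intervals(j):
    """Half-open intervals [i * 2**s, (i + 1) * 2**s) for the blocks of j, sorted."""
    return sorted((i << s, (i + 1) << s) for i, s in dyadic_blocks(j))

if __name__ == "__main__":
    print(dyadic_blocks(13))    # 13 = 1101 in binary
    print(block_intervals(13))
```

Each block has length $2^{s}$ in the scale of $\Phi$, which is why a second-moment bound such as (6.1) applied block by block, summed over all $s\le r$, controls the full sum.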


There are examples which show that the error term in (6.2) can sometimes exceed $\Phi^{1/2}(N)(\log(\Phi(N)+2))^{1/2}$ infinitely often (see Chapters One and Five in [11]). To apply Theorem 7 to the type of problems considered so far, let $f_k(\alpha)$ be the characteristic function of $A_k$ and put $f_k = \mu(A_k)$. The left hand side of (6.1) is thus

$$\int\sum_{m\le j,k\le n}f_j(\alpha)f_k(\alpha)\,d\mu - 2\int\sum_{m\le k\le n}f_k(\alpha)\,d\mu\sum_{m\le j\le n}f_j + \mu(\Omega)\Big(\sum_{m\le k\le n}f_k\Big)^{2}$$

$$= \sum_{m\le j,k\le n}\mu(A_j\cap A_k) - \Big(\sum_{m\le k\le n}\mu(A_k)\Big)^{2} \qquad (6.3)$$


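assuming that $\mu(\Omega)=1$. We thus see that to obtain an asymptotic formula we require much more than

$$\limsup_{N\to\infty}\frac{\big(\sum_{k=1}^{N}\mu(A_k)\big)^{2}}{\sum_{1\le j,k\le N}\mu(A_j\cap A_k)} > 0;$$

we actually require

$$\sum_{m\le j,k\le n}\mu(A_j\cap A_k) = \Big(\sum_{m\le k\le n}\mu(A_k)\Big)^{2} + O\Big(\sum_{m\le k\le n}\varphi_k\Big),$$

where

$$\Phi^{1/2}(N)(\log(\Phi(N)+2))^{3/2+\epsilon} = o\Big(\sum_{k=1}^{N}\mu(A_k)\Big).$$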
k=] 


Theorem 7 has had many applications in metric number theory. Often it is easier to work directly with the left hand side of (6.1) rather than the expression in (6.3). We give three examples by way of illustration, pointing out the distinctive features of their proofs.


Theorem 8 (W.M. Schmidt) [22]. Let $\psi(n)$ be a monotonic decreasing function with $\psi(n)\in(0,\tfrac12)$. Write

$$\Psi(N) = \sum_{n=1}^{N}\psi(n),$$

and suppose that $\Psi(\infty)$ diverges. Then the number of solutions to

$$|\alpha n-m| < \psi(n),\quad n\le N \qquad (6.4)$$

is, for almost all $\alpha$,

$$2\Psi(N) + O\big(\Psi^{1/2}(N)(\log(\Psi(N)+2))^{2+\epsilon}\big). \qquad (6.5)$$
Proof: For full details see [22], or Theorem 4.1 in [11]. One very important ingredient in the proof is to impose the condition $(m,n)\le D$ in (6.4) (as we did in §4). Now $D$ is taken to vary with $n$, to be precise $D = \Psi^{2}(n)+1$. It can then be shown that the number of solutions excluded by this condition is relatively small for almost all $\alpha$. Also (compare §4) the difference between $\Psi(N)$ and

$$\sum_{n\le N}\psi(n)\,\phi(D(n),n)/n$$

is not larger than the error term in (6.5). It then remains to show that the sets $A_j$, $A_k$ overlap in the way expected on average. In §4 the worst case estimates were taken, but these can be improved for monotonic $\psi(n)$ to give the required result. The crucial feature is that the length of the overlap is $(j>k)$

$$\min\Big(\frac{2\psi(j)}{j},\ \frac{\psi(j)}{j}+\frac{\psi(k)}{k}-\frac{|h|}{jk}\Big)$$

when

$$|ju-kv| = h,\quad |h| < j\psi(k)+k\psi(j),\quad (j,v)\le D(j),\quad (k,u)\le D(k). \qquad (6.6)$$

Since one only needs an upper bound on the overlap (as (6.3) cannot be negative), except for $h=0$ one removes the conditions $(j,v)\le D(j)$, $(k,u)\le D(k)$, and thereby we obtain the required estimate. The fact that $D$ increases with $n$ only worsens the power of log in the error term from $\tfrac32$ to 2. □


Theorem 9 (W. Philipp) [20]. Let $T$ be the transformation associated with the continued fraction expansion of a real number, namely

$$T\alpha = \begin{cases}0 & \text{if } \alpha = 0\\ \{\alpha^{-1}\} & \text{otherwise.}\end{cases}$$

Let $I_n$ be an arbitrary sequence of intervals contained in $[0,1)$. Write $A(N,\alpha)$ for the number of solutions to $T^{n}\alpha\in I_n$, $1\le n\le N$, and put

$$\Phi(N) = \sum_{n\le N}\mu(I_n)$$

where

$$\mu(I_n) = \frac{1}{\log 2}\int_{I_n}\frac{dx}{1+x}.$$

Then, given $\epsilon>0$, for almost all $\alpha$, we have

$$A(N,\alpha) = \Phi(N) + O\big(\Phi^{1/2}(N)(\log(\Phi(N)+2))^{3/2+\epsilon}\big).$$
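To make the objects in this theorem concrete, the following sketch implements the map $T$, the measure $\mu$ of an interval, and the count of $n$ with $T^{n}\alpha\in I_n$. The function names are hypothetical. Exact rational arithmetic is used because the map is chaotic, so floating-point orbits lose accuracy after a few dozen steps; a rational starting point reaches 0 after finitely many steps, so this only illustrates the definitions, not the almost-everywhere statement.

```python
from fractions import Fraction
import math

def gauss_map(x):
    """T x = 0 if x == 0, otherwise the fractional part of 1/x."""
    if x == 0:
        return x
    y = 1 / x
    return y - math.floor(y)

def gauss_measure(a, b):
    """Measure of [a, b) within [0, 1): (1/log 2) * integral of dx/(1+x) over [a, b)."""
    return math.log((1 + b) / (1 + a)) / math.log(2)

def count_hits(alpha, intervals):
    """Number of n with T^n(alpha) in intervals[n-1] = [a_n, b_n), for 1 <= n <= len(intervals)."""
    hits, x = 0, alpha
    for a, b in intervals:
        x = gauss_map(x)
        if a <= x < b:
            hits += 1
    return hits
```

With $I_n = [0,\tfrac12)$ for every $n$, each interval has measure $\log_2(3/2)\approx 0.585$, so the expected count $\Phi(N)$ grows linearly in $N$.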
Remark: Philipp actually proved this result for other transformations of number-theoretic interest with differing choices for $\mu$ appropriate to each transformation. The above theorem has an interesting corollary in providing information on blocks of numbers among consecutive partial quotients in the continued fraction expansion for almost all $\alpha$ (see [13]).


Proof: See [20] for details. As usual we need to estimate certain overlaps. If we put $E_n = T^{-n}I_n$, it is shown that, for $n>m$,

$$\mu(E_n\cap E_m) = \mu(E_n)\mu(E_m) + O\big(\mu(E_n)q^{n-m}\big)$$

for some $q<1$. Since

$$\sum_{m<n}q^{n-m} = O(1)$$

we obtain Theorem 9 from Theorem 7. □


Theorem 10 (G. Harman) [10], [11]. Let $a_n\ge 2$ be a lacunary sequence, and write

$$S_N(\alpha) = |\{n : [\alpha a_n] = p,\ n\le N\}|,\qquad \Psi(N) = \sum_{n=1}^{N}\frac{1}{\log a_n}.$$

Suppose that $\Psi(N)\to\infty$ as $N\to\infty$. Then, for almost all $\alpha>0$ and for any $\epsilon>0$ we have

$$S_N(\alpha) = \Psi(N) + O\big(\Psi(N)^{1/2}(\log(\Psi(N)+2))^{3/2+\epsilon}\big).$$

Remark: If we take $a_n = 10^{n}$ this has an interesting corollary for primes appearing in the decimal expansions of almost all positive $\alpha$.


Proof: Consider the interval $[A,A+1]$, $A>0$, with

$$A_n = [A,A+1]\cap\bigcup_{p}\Big[\frac{p}{a_n},\ \frac{p+1}{a_n}\Big).$$

It is relatively straightforward to obtain 


» (Am O An) 


M<m<«<n<N 


3 


1 ] 
7 2. . es log apn 


a 
MeneN (log a,) (log am) M<n<N 


using known results on primes in short intervals together with the convergence of 


Sees 


= 
| log’ an 
For the overlap sum with a > Ap it suffices to obtain a rough upper bound for


> H(Am 1 An) 


M<m«<n<N 


3 


using a sieve method. □


§7 Further Applications 


Most of the results given above generalize to $\mathbb{R}^{k}$ for $k>1$. For example, the present author's research student Huw Jones [15] has recently obtained the following generalization of Theorem 1.


Theorem 11 Let $\psi_j(n)$, $j=1,\ldots,k$ be monotonic decreasing functions with $\psi_j(n)\in(0,\tfrac12)$, and write $P(\boldsymbol\alpha,N)$ ($\boldsymbol\alpha = (\alpha_1,\ldots,\alpha_k)$) for the number of solutions to

$$|p\alpha_j-q_j| < \psi_j(p),\quad p, q_j \text{ primes},\quad p\le N. \qquad (7.1)$$

Also, put

WN) = 2k > Wile): +- Walp) 


rr 


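Then, if $\Psi(\infty)$ diverges, we have, for almost all $\boldsymbol\alpha\in\mathbb{R}^{k}$,

$$\limsup_{N\to\infty}\frac{P(\boldsymbol\alpha,N)}{\Psi(N)}\ge 1,\qquad \liminf_{N\to\infty}\frac{P(\boldsymbol\alpha,N)}{\Psi(N)}\le 1. \qquad (7.2)$$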

Several authors have generalized Khintchine's theorem to other situations. In many of these generalizations (for example [1], [19], [23], [24], [25]) the reals with Lebesgue measure are replaced with some group on which an appropriate measure is defined, while the role played by the integers is now taken by a discrete subgroup (or the role of the rationals is taken by the orbit of some element under a group action). The result we stated as Lemma 3 features in these works, but the approach to the overlap estimates is more geometrical.

Work on submanifolds of $\mathbb{R}^{k}$ has mainly concentrated on proving the convergence/finitely many solutions result, which for the whole of $\mathbb{R}^{k}$ is very easy, but is far from trivial for submanifolds (see Chapter 9 of [11]). We seem to be some way from proving a complete Khintchine type result in this situation, although Khintchine himself proved one result [17], and some authors have shown that one can obtain an asymptotic formula result in the divergence case for certain manifolds.

An outstanding problem in the area we have discussed is to obtain asymptotic formulae for those problems where only lim sup and lim inf results are known. Indeed, it would be an advance just to obtain a non-trivial upper bound for lim sup and a non-trivial lower bound for lim inf for almost all $\alpha\in\Omega$. In view of the examples we gave above it would seem that more arithmetical information must be fed into the method.

Perhaps the major problem, however, is the stubborn Duffin and Schaeffer conjecture. If we attack it by Lemma 3 we are led to certain overlap estimates, but we seem unable to obtain satisfactory bounds in the most general case (see the discussion in Chapter 2 of [11]).
Pythagorean Triples
Edmund Hlawka 


Historical Introduction 


The investigation of Pythagorean triples has a very long history. For the first hundred years I refer to the famous book [DIC01]. Triangles of this type were given by Greek and Indian mathematicians. Arithmetically these are the solutions of the diophantine equation

$$x^{2}+y^{2} = z^{2}$$

in rational numbers. The general solution is given by the formulas

$$x = l(m^{2}-n^{2}),\quad y = l\cdot 2mn,\quad z = l(m^{2}+n^{2}),\qquad l\ (l\ne 0),\ m,\ n \text{ arbitrary}.$$

These formulas are already contained in the works of Euclid and Brahmagupta. We also mention Bhascara, Pisano, Vieta,¹ Euler and Kronecker.

In early days the case $|x-y| = 1$ was studied (example: $x=3$, $y=4$, $z=5$). This leads to the Pell-Fermat equation


x? —277 = +1. 


It was treated in antiquity but is still of interest. (In [RUN01] further references can be found. The paper [PAR01], unfortunately, was not accessible in the original version.) In this paper we will refer to further historic articles.
The intimate connection with the right-angled triangles and also with the unit circle given 
by 
1—?? 2t m 


a ee n= sinw = 


= COs SS = 
5 e 1+ 12 n 


is well-known. 

The first important progress, namely that $\varphi/\pi$ is irrational, was made by Scherrer and Hadwiger. I refer to my article [HLA01]. This article contains new results which were presented in a lecture on Dec 9, 1977.


¹ F. Vieta, “Genesis triangulorum”.
Consider the Gaussian plane of numbers $\alpha = A+Bi$, where $i = \sqrt{-1}$ and $A$, $B$ are integers. Here the norm $N(\alpha) = A^{2}+B^{2}$ satisfies $N(\alpha)\ge 1$. The pair $(A,B)$ is considered to be the point with the integer coordinates $A$ and $B$. The set of these points forms a lattice with the unit square forming its fundamental area. This lattice is invariant under transformations $T$ of the type

$$x' = Ex+Fy,\qquad y' = Gx+Hy,$$

where $E$, $F$, $G$, $H$ are integers with determinant 1. The corresponding inverse transformation is given by

$$x = Hx'-Fy',\qquad y = -Gx'+Ey'.$$


Together with $\alpha$ we consider its conjugate $\bar\alpha = A-iB$ and define

$$p(\alpha,\epsilon) = \epsilon\,\frac{\alpha}{\bar\alpha} = \epsilon\,\frac{A+iB}{A-iB}, \qquad (1)$$

where $\epsilon$ is one of the four different powers of $i$: $1$, $i$, $i^{2} = -1$, $i^{3} = -i$.
The set of all $p(\alpha,\epsilon)$ forms a group with respect to multiplication with unit $E = p(1,1)$ and inverse $p(\alpha,\epsilon)^{-1} = p(\bar\alpha,\bar\epsilon)$. We call this group the Pythagorean group $P$. All its members have norm 1. We call all $p(\alpha,\epsilon)$ associated with $p(\alpha,1)$. If we choose one of them we write $p(\alpha)$. Note that among these numbers

$$\alpha = A+iB,\quad \alpha i = -B+Ai,\quad \alpha i^{2} = -(A+Bi),\quad \alpha i^{3} = B-Ai \qquad (2)$$

resp. among the numbers

$$\bar\alpha = A-iB,\quad \bar\alpha i = B+Ai,\quad \bar\alpha i^{2} = -A+iB,\quad \bar\alpha i^{3} = -B-Ai$$

exactly one has positive real and imaginary part, which we call the principal number of these four numbers. Here we may assume $A>B$, otherwise exchange $A$ and $B$.
Multiplication of numerator and denominator in the fraction (1) with $\alpha$ yields

$$p(\alpha) = X(\alpha)+iY(\alpha), \qquad (3)$$

with

$$X(\alpha) = \frac{A^{2}-B^{2}}{A^{2}+B^{2}},\qquad Y(\alpha) = \frac{2AB}{A^{2}+B^{2}}. \qquad (4)$$

Putting

$$x = X(\alpha),\quad y = Y(\alpha),\quad z = p(\alpha) = x+iy, \qquad (5)$$
we get

$$z\bar z = x^{2}+y^{2} = 1, \qquad (6)$$

i.e. $(x,y)$ is on the unit circle and the triple

$$(A^{2}-B^{2},\ 2AB,\ A^{2}+B^{2}) = (a,b,c) \qquad (7)$$

fulfills the diophantine equation

$$a^{2}+b^{2} = c^{2}. \qquad (8)$$

We have

$$z = \frac{z+1}{\bar z+1}, \qquad (9)$$

if $z+1\ne 0$, hence $x+1\ne 0$.
Putting

$$t = \frac{y}{x+1} = \frac{B}{A}, \qquad (10)$$

we get for $A\ne 0$

$$z = x+iy = \frac{1+it}{1-it} = \frac{(1+it)^{2}}{1+t^{2}}, \qquad (11)$$

hence

$$x = \frac{1-t^{2}}{1+t^{2}},\qquad y = \frac{2t}{1+t^{2}}, \qquad (12)$$

the well known parameter representation of the unit circle, where the point $(-1,0)$ is deleted.

The parameter representation in (4) is given in the homogeneous parameters $(A,B)$ containing the point $(-1,0)$ corresponding to $A=0$, $B=1$.

In (7) we suppose all multiples $(\lambda a,\lambda b,\lambda c)$, $\lambda\ne 0$ integer, to be contained.
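The passage from the pair $(A,B)$ to the triple $(a,b,c)$ and to the rational point $(x,y)$ on the unit circle can be checked with exact arithmetic. This is a small illustrative sketch; the function names `triple`, `circle_point` and `from_slope` are hypothetical labels for the formulas (7), (4) and (12).

```python
from fractions import Fraction

def triple(A, B):
    """The triple (A^2 - B^2, 2AB, A^2 + B^2) of formula (7)."""
    return (A * A - B * B, 2 * A * B, A * A + B * B)

def circle_point(A, B):
    """The rational point (x, y) = (a/c, b/c) on the unit circle, as in (4)."""
    a, b, c = triple(A, B)
    return Fraction(a, c), Fraction(b, c)

def from_slope(t):
    """The same point from the parameter t = B/A, as in (12)."""
    t = Fraction(t)
    return (1 - t * t) / (1 + t * t), 2 * t / (1 + t * t)

if __name__ == "__main__":
    print(triple(2, 1), circle_point(2, 1), from_slope(Fraction(1, 2)))
```

With $A=2$, $B=1$ this gives the familiar triple and the point $(3/5, 4/5)$, and the parameter $t=B/A=1/2$ leads to the same point.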


In the author's paper [HLA01] we have gone the converse direction (here we refer to it briefly as “Pythagoräische Tripel I” or “PT I”, while for the paper in front of you we write “PT II”). PT II can be read independently. Some results from PT I will be quoted and proofs will be sketched.
We want to describe the group $P$ explicitly: If

$$\alpha_1 = A_1+B_1 i,\qquad \alpha_2 = A_2+B_2 i,$$

then

$$\alpha_3 = \alpha_1\alpha_2 = A_3+iB_3,\qquad A_3 = A_1A_2-B_1B_2,\quad B_3 = A_2B_1+A_1B_2, \qquad (13)$$

$$p(\alpha_3) = p(\alpha_1)\,p(\alpha_2). \qquad (14)$$
Hence the triple (7) is given by $(a_3,b_3,c_3)$ with

$$a_3 = A_3^{2}-B_3^{2},\quad b_3 = 2A_3B_3,\quad c_3 = A_3^{2}+B_3^{2} = (A_1^{2}+B_1^{2})(A_2^{2}+B_2^{2}).$$

If $\alpha = A+Bi$, then ($L$ integer)

$$\alpha^{L} = A_L+B_L i.$$

Let first $L$ denote a natural number, then

$$A_L+iB_L = (A+iB)^{L} = \sum_{k=0}^{L}\binom{L}{k}A^{L-k}(iB)^{k}. \qquad (15)$$

We distinguish the case $k=2r$ even (to get $A_L$) and the case $k=2r+1$ odd (to get $B_L$):

$$A_L = \sum_{r}(-1)^{r}\binom{L}{2r}A^{L-2r}B^{2r},$$

$$B_L = \sum_{r}(-1)^{r}\binom{L}{2r+1}A^{L-2r-1}B^{2r+1}, \qquad (16)$$

$$C_L = (A^{2}+B^{2})^{L}.$$

Here $r$ takes all nonnegative integers with $r\le\tfrac12 L$.
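The binomial formulas (16) can be verified against direct multiplication of Gaussian integers. The sketch below is an illustration only: the function names are hypothetical, it covers $L\ge 1$, and it uses Python's `math.comb` (available from Python 3.8).

```python
from math import comb

def power_parts(A, B, L):
    """Real and imaginary parts (A_L, B_L) of (A + iB)^L for L >= 1, via formula (16)."""
    A_L = sum((-1) ** r * comb(L, 2 * r) * A ** (L - 2 * r) * B ** (2 * r)
              for r in range(L // 2 + 1))
    B_L = sum((-1) ** r * comb(L, 2 * r + 1) * A ** (L - 2 * r - 1) * B ** (2 * r + 1)
              for r in range((L - 1) // 2 + 1))
    return A_L, B_L

def power_by_multiplication(A, B, L):
    """The same pair, obtained by repeated multiplication using (13)."""
    x, y = 1, 0
    for _ in range(L):
        x, y = x * A - y * B, x * B + y * A
    return x, y

if __name__ == "__main__":
    print(power_parts(2, 1, 3))  # (2 + i)^3 = 2 + 11i
```

In each case $A_L^{2}+B_L^{2} = (A^{2}+B^{2})^{L}$, which is the relation $C_L = (A^{2}+B^{2})^{L}$ stated above.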
For negative $L = -m$, let $A_L = A_m$, $B_L = -B_m$, hence

$$\alpha^{L} = A_L+iB_L = A_m-iB_m = \bar\alpha^{\,m}.$$

In $P$ we introduce a further operation,² which we denote by $\oplus$,

$$p(\alpha_0) = p(\alpha_1)\oplus p(\alpha_2).$$

For $\alpha_j = A_j+iB_j$, $j=1,2$, define

$$A_0 = A_1A_2,\qquad B_0 = B_1B_2.$$

For all $j = 0,1,2$ we put

$$N_j = A_j^{2}+B_j^{2},\quad D_j = A_j^{2}-B_j^{2},\quad L_j = 2A_jB_j.$$

We get

$$p(\alpha_j) = X_j+iY_j = \frac{D_j+iL_j}{N_j}$$

and $N_0 = |\alpha_0|^{2} = A_0^{2}+B_0^{2}$. An easy computation shows

$$N_0 = \tfrac12(N_1N_2+D_1D_2). \qquad (0)$$


² which seems not to occur in the literature explicitly.
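Furthermore we have

$$\alpha_0^{2} = A_0^{2}-B_0^{2}+2iA_0B_0 = D_0+iL_0$$

and

$$D_0 = \tfrac12(D_1N_2+D_2N_1),\qquad L_0 = \tfrac12 L_1L_2, \qquad (00)$$

hence

$$p(\alpha_0) = \frac{\alpha_0}{\bar\alpha_0} = \frac{\alpha_0^{2}}{|\alpha_0|^{2}} = X_0+iY_0.$$

Since

$$X_j = \frac{D_j}{N_j},\qquad Y_j = \frac{L_j}{N_j},$$

(0) and (00) yield

$$X_0 = \frac{X_1+X_2}{1+X_1X_2},\qquad Y_0 = \frac{Y_1Y_2}{1+X_1X_2} \qquad (17)$$

and therefore

$$\operatorname{Re}\big(p(\alpha_1)\oplus p(\alpha_2)\big) = \frac{\operatorname{Re}p(\alpha_1)+\operatorname{Re}p(\alpha_2)}{1+\operatorname{Re}p(\alpha_1)\operatorname{Re}p(\alpha_2)}, \qquad (18)$$

$$\operatorname{Im}\big(p(\alpha_1)\oplus p(\alpha_2)\big) = \frac{\operatorname{Im}p(\alpha_1)\operatorname{Im}p(\alpha_2)}{1+\operatorname{Re}p(\alpha_1)\operatorname{Re}p(\alpha_2)}. \qquad (18')$$

We want to bring out a consequence:

$$|Y_j| = \sqrt{1-X_j^{2}}$$

holds for all $j$, since

$$X_j^{2}+Y_j^{2} = 1.$$

Hence (18') gets

$$\sqrt{1-X_0^{2}} = \frac{\sqrt{1-X_1^{2}}\,\sqrt{1-X_2^{2}}}{1+X_1X_2}, \qquad (18'')$$

yielding

$$X_0^{2} = 1-(1-X_0^{2}) = 1-\frac{(1-X_1^{2})(1-X_2^{2})}{(1+X_1X_2)^{2}}. \qquad (19)$$

We want to write (18'') in a different form: Define

$$\gamma(X) = \frac{1}{\sqrt{1-X^{2}}},$$

then (18'') can be written as

$$\gamma(X_0) = \gamma(X_1)\gamma(X_2)(1+X_1X_2). \qquad (20)$$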
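The composition law for the real parts is the familiar addition formula $X_0 = (X_1+X_2)/(1+X_1X_2)$, and it can be confirmed with exact rationals. In the sketch below the names `p` and `oplus` are hypothetical; `oplus` simply applies the definition $A_0 = A_1A_2$, $B_0 = B_1B_2$, with the operation symbol read as $\oplus$.

```python
from fractions import Fraction

def p(A, B):
    """(X, Y) = ((A^2 - B^2)/(A^2 + B^2), 2AB/(A^2 + B^2)), as in formula (4)."""
    N = A * A + B * B
    return Fraction(A * A - B * B, N), Fraction(2 * A * B, N)

def oplus(A1, B1, A2, B2):
    """The point p(alpha_0) with A_0 = A_1*A_2 and B_0 = B_1*B_2."""
    return p(A1 * A2, B1 * B2)

if __name__ == "__main__":
    X1, Y1 = p(2, 1)
    X2, Y2 = p(3, 2)
    X0, Y0 = oplus(2, 1, 3, 2)
    print(X0, (X1 + X2) / (1 + X1 * X2))   # both sides of the first formula in (17)
    print(Y0, Y1 * Y2 / (1 + X1 * X2))     # both sides of the second formula in (17)
```

Squaring (20) avoids square roots: $1/(1-X_0^{2})$ should equal $(1+X_1X_2)^{2}/((1-X_1^{2})(1-X_2^{2}))$, which the same rational values confirm.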
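Now we involve geometric aspects. There is a unique angle $\varphi_0$ with $0\le\varphi_0<2\pi$, such that for all $\varphi$ of the type $\varphi_0+2\pi k$ ($k$ takes all integer values, we write $\varphi\equiv\varphi_0$ (mod $2\pi$))

$$p(\alpha) = e^{i\varphi} = \cos\varphi+i\sin\varphi,$$

hence

$$\cos\varphi = \frac{A^{2}-B^{2}}{A^{2}+B^{2}},\qquad \sin\varphi = \frac{2AB}{A^{2}+B^{2}}.$$

More explicitly we write

$$\varphi = \operatorname{arc}p(\alpha).$$

Note

$$\operatorname{arc}(p(\alpha_1\alpha_2)) \equiv \operatorname{arc}p(\alpha_1)+\operatorname{arc}p(\alpha_2) \pmod{2\pi} \qquad (21)$$

$$\operatorname{arc}(p(\alpha^{L})) \equiv L\operatorname{arc}p(\alpha) \pmod{2\pi}. \qquad (22)$$

Again we have

$$X = \frac{A^{2}-B^{2}}{A^{2}+B^{2}},\qquad Y = \frac{2AB}{A^{2}+B^{2}}.$$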
Considering the determinants 
xX | xX i 
n=ly i] B=[y =| 
1 1 1 | 
= (04) t= | 5 4) 
and their cross ratio we get 
. 2 
T, TI» A+iB 
CR(T, T3; T2,T) = —: —- = 
(1), 73; 12, T4) rT, (3) 


hence (due to Laguerre) 
l 
aan log CR(T), T3; Tz, Ts). 


This interpretation is often useful for applications of Pythagorean triples.
We put

$$\varphi = \pi\chi.$$

The Swiss mathematicians Scherrer and Hadwiger³ have proved:
If $\alpha = A+iB$, $AB\ne 0$ and $A^{2}\ne B^{2}$, then $\chi$ is irrational.


³ [SCH01], [HAD01].
Proof:⁴ Assume, by contradiction, that $\chi = m/L$ is rational. Then

$$\frac{\alpha}{\bar\alpha} = e^{i\pi m/L}$$

and

$$\Big(\frac{\alpha}{\bar\alpha}\Big)^{L} = e^{i\pi m} = \pm 1.$$

Since

$$\frac{\alpha^{L}}{\bar\alpha^{L}} = \frac{A_L+iB_L}{A_L-iB_L},$$

we get

$$A_L+iB_L = \pm(A_L-iB_L). \qquad (*)$$

We may assume that $A$ and $B$ have no common prime divisors. Let us make the assumption that

$$L\equiv 1 \pmod 2,\quad A\equiv 1 \pmod 2,\quad B\equiv 0 \pmod 2. \qquad (**)$$

Consider first the case of negative sign in (*), thus

$$A_L = 0.$$

Considering (16) modulo 2 leads to a contradiction, since with the exception of $A^{L}$ (which is odd by assumption (**)) all terms have the even divisor $B$.

Suppose now that L is even: 
Let 2? the maximal power of 2 in L, i.e. 


L =2°L, 
L, odd. We put 6 = a’, then 
(=)'=(§) 
a? AB 
Let 
Xl — p 
B 


Maintaining all other assumptions from (**), we see from the preceeding argument that x1 
and, since x; = 2? y, x are irrational. Now cancel condition (*)! 
Assume that A and B have no common prime divisors. It suffices to treat the case 


A = B = 1 (mod 2). (***) 


Take w? instead of a. The corresponding triple is (a2, b2, c2)a2 = A* — B?, bo = 2AB, 
C2 = (A? + B7)*. (&*) implies 


A2 = 0 (mod 4), Bz =0 (mod 2), 


⁴ The proof presented here follows the book [MES01].
thus

$$A_3 = \tfrac12 A_2\equiv 0 \pmod 2,\qquad B_3 = \tfrac12 B_2 = AB\equiv 1 \pmod 2.$$

Thus the triple $(B_3, A_3, C_3)$ fulfills our conditions and the proof is complete. □


Consider the sequence $\omega = (2k\chi)$. Since $2\chi$ is irrational, a result of the theory of uniform distribution (confer the textbooks [HLA02] and [KUI01]) tells us that the sequence is uniformly distributed modulo 1.
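Uniform distribution can be observed numerically by computing the star discrepancy of the first $N$ terms. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the sequence is normalised as $k\varphi/(2\pi)$ mod 1 with $\varphi = \operatorname{arc}p(\alpha)$, which may differ from the normalisation of $\chi$ in the text by a rational factor. The discrepancy is computed with the standard formula for sorted points $x_1\le\dots\le x_N$, namely $\max_i\max(i/N-x_i,\ x_i-(i-1)/N)$.

```python
import math

def star_discrepancy(points):
    """Star discrepancy of a finite set of points in [0, 1)."""
    xs = sorted(points)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

def angle_sequence(A, B, N):
    """The points k*phi/(2*pi) mod 1 for k = 1..N, where phi = arc p(A + iB)."""
    phi = math.atan2(2 * A * B, A * A - B * B)
    step = (phi / (2 * math.pi)) % 1.0
    return [(k * step) % 1.0 for k in range(1, N + 1)]

if __name__ == "__main__":
    for N in (100, 1000, 10000):
        print(N, star_discrepancy(angle_sequence(2, 1, N)))
```

For $\alpha = 2+i$ the computed values shrink as $N$ grows, consistent with uniform distribution; floating-point arithmetic limits the accuracy for very large $N$.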

Now we compute the discrepancy $D_N$ of the sequence. By the Erdős-Turán inequality we have


M 
1 i 
Dy <Cc{l— Y> —|W h 
nN < mt 2. thy n(h)| 


Here Wy (h) is the Weyl sum 
ptt 
_ 2nihk x 
Wn(h) = — 2. e ; 


We recall the easy computation. Note 


] 


| )| |1 — e27ihx| | sin? why | 
and 
Bh 
sin why = Ch’ 


with C = A? + B? ist. 
By is an integer and, since x is irrational, we have By, 4 O. Thus we get 


l 
|lsin thx| > Bh 
and we have shown that 
[Wv(h)] <C". 
The choice M = ee + 1 leads to 
20 log C 
og 


We will use this formula several times. 
If f is integrable in the Riemann sense and with a period 1, then 


N 

1 I l 

~— > FRx) mf #@)de 4 oD) )). 
k=1 0 
where $|\vartheta|\le 1$ and $\sigma(\epsilon,f)$ denotes the integrability module of $f$. If $f$ has bounded variation $V(f)$, then we can give the more concrete estimation


N 1 
] 
In(f=— Dy flex) = | f(x)dx + OV(f)DN. 
k=1 


If f has the form G(cos 27x, sin 27 x) (G integrable in E *) then, provided that G(cos x, 
sin x) has bounded variation V(G), we get 


I 
An(G) = | G(cos27x,sin27x)dx + 0V(f)Dn, 
0 


where 
l 
Aw(G) = = ) | Glan, ba) 

with 
ae An — By, bee 2A2L Bor 
2 L? dar) L 
Aj, + By, A>, + By, 

and 


AS, + BZ, = (A* + B*)*E. 


For differentiable G we have 


aG\* (aG\’ 
V(G) < Max Sa ab ae |e 
E2 Ox dy 


§2 


It is useful to transfer the uniformly distributed sequences considered above to the higher dimensional case. Consider a prime of the type $p = 4k+1 = A^{2}+B^{2}$, where the representation as a sum of two integer squares is unique up to the sign and order of $A$ and $B$. Using complex numbers we have the representation

$$p = (A+iB)(A-iB) = \pi(p)\cdot\bar\pi(p),$$

where $\pi(p)$, $\bar\pi(p)$ are different primes in the number field $\mathbb{Z}(i)$. We have

$$\frac{\pi(p)}{\bar\pi(p)} = e^{i\pi\chi}, \qquad (1)$$

where $\chi$ is irrational and the sequence $(k\chi)$ is uniformly distributed mod 1.
If $p_1, p_2,\ldots,p_s$ are pairwise different primes of the type $4k+1$ with corresponding $(\pi_1,\bar\pi_1),\ldots,(\pi_s,\bar\pi_s)$ and angles $\chi_1,\ldots,\chi_s$, then these angles are linearly independent in the sense of uniform distribution. The reason is that the prime decomposition in $\mathbb{Z}(i)$ is unique. A consequence is that the sequence $(k\chi_1,\ldots,k\chi_s)$ is uniformly distributed mod 1 in $E^{s}$. In the author's paper [HLA03] the proof of the estimation


(loglog N)* 
DS,(pi.-.-s Ps) < 4°Cs(log Py) (2) 
log N 
for $P_s = p_1\cdots p_s$ can be found.
In

$$p_j = (A_j+iB_j)(A_j-iB_j) = A_j^{2}+B_j^{2} \qquad (3)$$

the numbers $A_j$ and $B_j$ are given by (this representation is due to Jacobsthal)


Pj > ae 
a, = AY (4) (A+) és 
Ho NPI Pj 
k2 . 
Bj = (4 -) tu), (5) 


where the r; are the quadratic residues modulo pj, and u; the quadratic non-residues 
modulo p;. For r; one may independently take the residue = —1. For the non-residues 
one has to look among the numbers 1, ..., 5 J Pj (confer [VINO1)). 

There are many results on the $A_j$ and $B_j$, even recent investigations. For instance, the case that also $A_j$ is a prime is treated in [FOU01].

As an example we give

$$p_1 = 5,\ \pi_1 = 1+2\sqrt{-1},\qquad p_2 = 13,\ \pi_2 = 3+2\sqrt{-1},\qquad p_3 = 17,\ \pi_3 = 4+\sqrt{-1}.$$
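For small primes the representation $p = A^{2}+B^{2}$ can be found by direct search, which is enough to reproduce the examples above. This sketch does not implement Jacobsthal's formulas; the function name `two_squares` and the convention of returning the pair with the larger entry first are assumptions.

```python
from math import isqrt

def two_squares(p):
    """Return (A, B) with A >= B > 0 and A^2 + B^2 = p, or None if no such pair exists."""
    for B in range(1, isqrt(p) + 1):
        A2 = p - B * B
        A = isqrt(A2)
        if A > 0 and A * A == A2:
            return (max(A, B), min(A, B))
    return None

if __name__ == "__main__":
    for p in (5, 13, 17, 29, 7):
        print(p, two_squares(p))
```

Since the representation is unique up to sign and order, the pair found for a prime $p\equiv 1$ (mod 4) determines the Gaussian prime $\pi(p)$ up to units and conjugation. Primes $p\equiv 3$ (mod 4), such as 7, have no such representation.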


§3 Applications 


Applications to formulas of relativity theory. Consider the famous energy formula 
MQ 


Expanding of (1 — v”)~* in terms of powers of v and taking only two terms one gets 


b= 


1 
(ei 5, 


and 
l 4 
Taking for v the value 
2 2 
Ay — By 


—>——. (7) 
Ai +B? 


UL = 
The theorem of additivity §1 (17) is


vp +v 
m 


we get 
l AL Br, 
By = 5mo (FE + 4) (8) 
d for the i l = —Zor 
and for the impulse p Toe 
se te oe Bh Be m (9) 
ae Vl—v2 2\Br AL is 
We have 
E} — pz =m). (10) 


The difference 


shows the error. 
Now we consider the change of the mass 


1 
Min=one (= Z ) 
1— v2 


of two particles of the same mass coming together. For v = vy, we get 


({A| — |B)? 
A Sj 11 
Lmo 0 AB (11) 


Computing for instance 
1 
* 
ee 
L=1 


under the assumption that vz lies in the interval J :0 <a < B < 1, we get from the theory 
of uniform distribution 


+ error = arcsin 6 — arcsina@ + error 


[ dv 
a J/1 — v2 
and
1 
W De *ad- v7)! = arctg y — arctg6 + error, 
L=1 


where y = arccos B, 6 = arccosa@. 
The error terms can be estimated using the discrepancy. 
Putting v = uv, the Lorentz transformation 


; x—vut ‘i vux+t (12) 
xX = = =_ 
V1 — v2 V1 — v2 
gets to 
; 1/AL Br, 1 (AL Br, 
Xp = eho t+—)xet+ertoe- 
oa. Ay I By Ap 
1f/Ac Br 1fAc Bt 
o= -{—-— —~{—+—]r. 13 
7 ($F 7) +5 (H+) oo 
Important points 
1 (AL Br, ; 1 (AL By 
XxX = _ —_—_— -—_ ; =C- —_—_ > 
ON Be Ay aa Le eee oe 
lie on the curve x* — t? = c?. 
In the three dimensional case the Lorentz transformation is given by 
i vpl 
Se Ss KO) ee ee 
2 2 
1 — v7 l— vu; 
i t — vz (xo) 
a 
2 
where 
KX = (%1,%2,%3), X = (X41, x2, X3) 
o = (01, 02,03) unit vector 
(xo) = x10) + x202 + x303. 
The case vy, = as ({vz| > 1) (Dirac electron) is also important. We do not go further 
into this direction. 
The sequence of the pairs (vz, v,) fir! = 1,2,... forms a countable model of a ‘Dirac 


sea’ (maybe this is an interpretation of a remark of Hilbert 1934 reported by the physicist 
Sauter). 
A little application to the general relativity theory: Using the equivalence principle we
get the Schwarzschild radii r;: 


Application to the triangle: If a right-angled triangle has the legs x, y and the hypotenuse
z then, by Pythagoras' theorem, $x^2 + y^2 = z^2$ and the triple $(A^2 - B^2, 2AB, A^2 + B^2)$
satisfies the equation. Frequently z and x are given and y has to be computed.

Starting with the relation $z^2 - y^2 = x^2$ we get the solution

$$z = \frac{1}{2}\left(\frac{A}{B} + \frac{B}{A}\right)x, \qquad y = \frac{1}{2}\left(\frac{A}{B} - \frac{B}{A}\right)x.$$

We get triples of a type treated above.


The following idea is due to R. Lauffer:⁵
Consider an arbitrary triangle with vertices E, F, G arranged as in the picture.

[Figure: triangle EFG; H lies on the base EF, with EH = e_1 and HF = e_2.]

Let the coordinates be given by $E = (-e_1, 0)$, $F = (e_2, 0)$, $G = (0, h)$, $H = (0, 0)$ and
the lengths of the edges be a for EG, b for FG. The length of EH is $e_1$, that of HF is $e_2$.
Let $p_1, p_2$ be primes with the corresponding Gaussian primes $\pi_1 = A_1 + iB_1$, $\pi_2 = A_2 + iB_2$. For

$$a = \frac{1}{2}\left(\frac{A_1}{B_1} + \frac{B_1}{A_1}\right)h, \qquad b = \frac{1}{2}\left(\frac{A_2}{B_2} + \frac{B_2}{A_2}\right)h,$$

$$e_1 = \frac{1}{2}\left(\frac{A_1}{B_1} - \frac{B_1}{A_1}\right)h, \qquad e_2 = \frac{1}{2}\left(\frac{A_2}{B_2} - \frac{B_2}{A_2}\right)h$$

⁵ Historical remark: Before Lauffer there were several mathematicians treating related problems: Brahmagupta,
Euler, Gauß, Blichfeldt and, mainly, H. Schubert. We again mention Dickson. Unfortunately, Schubert's original
papers were not accessible to me.

we have
The total length EF is

$$e_1 + e_2 = \frac{1}{2}\left(\frac{A_1}{B_1} + \frac{A_2}{B_2} - \frac{B_1}{A_1} - \frac{B_2}{A_2}\right)h.$$


Computing the angles α in E, β in F and γ in G, we get

$$\cos\alpha = \frac{A_1^2 - B_1^2}{A_1^2 + B_1^2}, \qquad \sin\alpha = \frac{2A_1B_1}{A_1^2 + B_1^2},$$

$$\cos\beta = \frac{A_2^2 - B_2^2}{A_2^2 + B_2^2}, \qquad \sin\beta = \frac{2A_2B_2}{A_2^2 + B_2^2}$$

(the formula for sin α is due to H. Schubert).
To compute γ we use

$$\cos\gamma = \cos(\pi - (\alpha + \beta)) = -\cos(\alpha + \beta) = -\cos\alpha\cos\beta + \sin\alpha\sin\beta \tag{14}$$

and

$$\sin\gamma = \sin(\alpha + \beta) = \sin\alpha\cos\beta + \cos\alpha\sin\beta, \tag{15}$$

which follows from above. Thus the trigonometric functions of the angles are rational
numbers. If h is a rational number, then the same holds for the edges.
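The construction can be tried out with exact fractions. The sketch below is illustrative and rests on assumptions: it takes a = ½(A₁/B₁ + B₁/A₁)h, e₁ = ½(A₁/B₁ − B₁/A₁)h (and likewise b, e₂ with A₂, B₂), reads off cos α = e₁/a and sin α = h/a from the right-angled triangle EHG, and uses γ = π − (α + β), so that cos γ = −cos(α + β). The name `rational_triangle` is hypothetical.

```python
from fractions import Fraction


def rational_triangle(A1, B1, A2, B2, h=1):
    """Edges and angle functions of the triangle EFG built from A1 + i*B1 and A2 + i*B2."""
    A1, B1, A2, B2, h = (Fraction(t) for t in (A1, B1, A2, B2, h))
    a = (A1 / B1 + B1 / A1) / 2 * h    # edge EG
    b = (A2 / B2 + B2 / A2) / 2 * h    # edge FG
    e1 = (A1 / B1 - B1 / A1) / 2 * h   # segment EH
    e2 = (A2 / B2 - B2 / A2) / 2 * h   # segment HF
    cos_a, sin_a = e1 / a, h / a       # angle alpha at E
    cos_b, sin_b = e2 / b, h / b       # angle beta at F
    # gamma = pi - (alpha + beta), hence the sign change in the cosine
    cos_g = -(cos_a * cos_b - sin_a * sin_b)
    sin_g = sin_a * cos_b + cos_a * sin_b
    return {"a": a, "b": b, "c": e1 + e2,
            "cos": (cos_a, cos_b, cos_g), "sin": (sin_a, sin_b, sin_g)}


tri = rational_triangle(2, 1, 3, 2)    # Gaussian primes 2 + i and 3 + 2i
print(tri)
# Law of cosines as an independent consistency check
print(tri["c"] ** 2 == tri["a"] ** 2 + tri["b"] ** 2 - 2 * tri["a"] * tri["b"] * tri["cos"][2])
```

All three edges and all six trigonometric values come out as rational numbers, as the text asserts for rational h.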


There are applications of the pythagorean triples to different branches of mathematics: 


rotations 

3-dimensional pythagorean triples 
quadratures 

interpolation on the unit circle 
rational points on spheres 

line geometry 

Clifford surfaces 

spherical and non-euclidean geometry 
infinite series 

partial differential equations 
‘natural geometry’. 


These applications are part of the content of a manuscript which is to be published later on. 
Integer Points in Plane Regions and Exponential Sums


M.N. Huxley 


How many integer points (m, n) lie inside a large circle, or in the annulus between two circles? There are
approaches by real-variable approximation theory, or by Fourier analysis. The same ideas occur in both, and 
the latest results use both methods at different stages of the argument. 


Analytic number theory is about counting the number of sets of integers satisfying certain 
conditions. There are two famous questions. 


1. The number of prime numbers in a short interval.
2. The number of integer points inside a circle: (m, n) with

$$m^2 + n^2 \le R^2.$$

In question 1, prime numbers are defined negatively by ruling out multiples of smaller prime
numbers. One asks: how many solutions of

$$pq = N + h,$$

with p a prime number in some range, and h a small integer? This is a very hard problem. An
easier version is to change from prime numbers to square-free numbers, defined negatively
by not being multiples of the squares of prime numbers. Now the relation is


$$p^2 q = N + h, \qquad p^2 q \approx N.$$

A version with two terms,

$$p_1^2 q_1 - p_2^2 q_2 = h, \qquad \frac{p_1}{p_2} \approx \sqrt{\frac{q_2}{q_1}},$$

leads to an approximate equation involving rational numbers.
The two questions can be generalised to two problems. 


Problem A Count the number of integer points inside a closed curve (as an asymptotic 
formula). 


Problem B Count the number of integer points (m, n) close to an arc of a curve y = f(x),
so that

$$|n - f(m)| \le \delta. \tag{1}$$

Upper or lower bounds are still interesting when there is no asymptotic formula.

Variants of these problems have integer points (m, n) replaced by rational points, either
projectively (points (m/q, n/q) with q ≤ Q corresponding to an integer point (m, n, q) in
projective space) or generally (points (m/n, r/q) with n ≤ N, q ≤ Q).
The tools of analytic number theory are:


1. Discreteness. An integer n with |n| < 1 is zero. 
2. Cauchy’s inequality. Mean squares are positive. 
3. Fourier analysis, usually Poisson summation. 


Problem A by Fourier analysis leads to exponential sums. 


Problem C Estimate the sum

$$\sum_{m=M}^{M'} e(f(m)), \qquad e(x) = \exp 2\pi i x, \quad M' \le 2M, \tag{2}$$

or the two-variable sum

$$\sum_{h=H}^{H'} \sum_{m=M}^{M'} e(hf(m)), \qquad H' \le 2H, \quad M' \le 2M. \tag{3}$$


To compare the problems we suppose that M ≤ m ≤ 2M, with f(m) = TF(m/M) in
Problems A and C (T has the dimensions of area), and f(m) = NF(m/M) in Problem B
(N has the dimensions of length, so MN corresponds to T in Problem A). Here F(x)
is a bounded function defined on an interval containing [1,2]. Certain derivatives and 
determinants of derivatives will occur in the estimates. These derivatives are assumed to 
be bounded in modulus. 
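For orientation, a sum of the type in Problem C can simply be evaluated numerically and compared with the trivial bound given by the number of terms. This is a brute-force sketch, not an estimate in the sense of the text; it assumes the scaling f(m) = TF(m/M), takes M' = 2M, and the helper name `exponential_sum` is invented here.

```python
import cmath
import math


def exponential_sum(F, M, T):
    """Sum of e(T * F(m / M)) over M <= m <= 2M, where e(x) = exp(2*pi*i*x)."""
    return sum(cmath.exp(2j * math.pi * T * F(m / M)) for m in range(M, 2 * M + 1))


M = 1000
T = M ** 2                                   # so that (log M)/(log T) = 1/2
S = exponential_sum(lambda x: 1 / x, M, T)   # F(x) = 1/x
print(abs(S), "against the trivial bound", M + 1)
```

The modulus of the sum is what the estimates try to bound; the trivial bound is the number of terms, M + 1.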


Standard case. Any derivative in the denominator is bounded away from zero. 


Non-standard case. Some derivative which should be in the denominator becomes very 
small. 


In a non-standard case some quantity is approximately constant. If its value has a good
rational approximation a/q with q small, then strange things happen, and the argument
may fail. If there is no good rational approximation, then the argument can be modified.
Here are some typical functions F(x).


Problem A $F(x) = \sqrt{1 - x^2}$ (Gauss's circle problem), $F(x) = 1/x$ (Dirichlet's divisor
problem).


Problem B $F(x) = 1/x$ (prime numbers), $F(x) = 1/\sqrt{x}$ (square-free numbers).


Problem C As in Problems A and B, and also F(x) = log x (size of the Riemann zeta- 
function). 


The first non-trivial results in each problem typically show a saving of 1/3 on some exponent
in the trivial bound, leading to exponents with denominators 3 or 6 in the final result.


Circle Problem (Sierpinski 1906). The number of integer solutions of $x^2 + y^2 \le R^2$ is

$$\pi R^2 + O(R^{2/3}).$$
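The count in the circle problem is easy to compute exactly for moderate R, which gives a feeling for the size of the error term. The sketch below is illustrative; it assumes an integer radius R and the closed disc m² + n² ≤ R², and the function name `circle_lattice_count` is made up for the example.

```python
import math


def circle_lattice_count(R):
    """Count integer pairs (m, n) with m*m + n*n <= R*R, for an integer R >= 0."""
    total = 0
    for m in range(-R, R + 1):
        # for fixed m, n ranges over -k..k with k = floor(sqrt(R^2 - m^2))
        total += 2 * math.isqrt(R * R - m * m) + 1
    return total


for R in (10, 100, 1000):
    count = circle_lattice_count(R)
    error = count - math.pi * R * R
    print(R, count, error, R ** (2 / 3))   # error term next to R^(2/3)
```

The last two columns put the actual error beside R^{2/3}; the comparison is only suggestive, since the O-constant is not specified.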
Divisor problem (integer points inside a hyperbola) (Voronoi 1903). The number of
positive integer solutions of mn ≤ T is

$$T \log T + (2\gamma - 1)T + O(T^{1/3} (\log T)^c).$$
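The same kind of check works for the divisor problem: the number of pairs (m, n) with mn ≤ T equals the sum of ⌊T/m⌋ over m ≤ T. The sketch below is illustrative; it assumes γ denotes Euler's constant, hard-coded to double precision, and the helper names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant, double precision


def divisor_count(T):
    """Number of positive integer pairs (m, n) with m * n <= T."""
    return sum(T // m for m in range(1, T + 1))


def main_term(T):
    """T log T + (2*gamma - 1) T, the main terms of the asymptotic formula."""
    return T * math.log(T) + (2 * EULER_GAMMA - 1) * T


for T in (100, 10 ** 4, 10 ** 6):
    print(T, divisor_count(T) - main_term(T), T ** (1 / 3))
```

As with the circle, the printed remainder is shown next to T^{1/3} purely for comparison.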
Size of the zeta function (Hardy and Littlewood 1921).

$$\zeta\left(\tfrac{1}{2} + iT\right) = O(T^{1/6} \log T).$$


Integer points close to curves (Vinogradov and van der Corput independently about
1914-18).

$$R = 2\delta M + O((MN)^{1/3} (\log MN)^c).$$
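A direct count shows how the number R of integer points near a curve compares with the expected value 2δM. The sketch is illustrative and makes assumptions: it uses the scaling f(m) = NF(m/M) with M ≤ m ≤ 2M, requires δ < 1/2 so that each m contributes at most one n, and works in floating point. The name `points_near_curve` is invented.

```python
def points_near_curve(F, M, N, delta):
    """Count m in [M, 2M] for which some integer n has |n - N*F(m/M)| <= delta.

    Assumes 0 <= delta < 1/2, so the nearest integer is the only candidate.
    """
    count = 0
    for m in range(M, 2 * M + 1):
        y = N * F(m / M)
        n = round(y)                 # nearest integer to f(m)
        if abs(n - y) <= delta:
            count += 1
    return count


M = N = 1000
delta = 0.05
R = points_near_curve(lambda x: x ** 0.5, M, N, delta)   # F chosen arbitrarily
print(R, "against the expected", 2 * delta * M)
```

The choice F(x) = √x is arbitrary; any bounded F with non-vanishing second derivative on [1, 2] would serve as an experiment.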


Rational points close to a curve (Huxley 1994a). With (m, n) in (1) replaced by (m/q, n/q)
with highest common factor (m, n, q) = 1 and q ≤ Q, Mq ≤ m ≤ 2Mq, and δ replaced
by δ/Q, for F''(x) ≠ 0


R=O (smo (=) + (MN)IP 0) 


General case (Huxley 2000). With (m,n) in (1) replaced by (m/n,r/q) with highest 
common factors (m,n) = (r,g) = 1,n < M,q < Q and 6 replaced by 5/07, with 
f(x) = T F(x), satisfying F’(x), F(x) and 3F" (x)? — 2F’(x) F(x) all non-zero, 


R= O(8'/4M? + (M?O°T (6M? +. 1))'3(MO7T)*). 


In these results c is some fixed exponent, ε is an exponent which can be taken arbitrarily
small. The constant factor suppressed in the O(.) notation is constructed from the range of
values of the derivatives of F(x), and from ε where present.

Van der Corput (1920 and later papers) developed an iterative procedure for estimating 
exponential sums. 


Step A First differences (or mean square over short intervals). This step keeps M the same, 
and replaces F(x) by F’(x) (to a first approximation). 


Step B Poisson summation. This step is used if M > √T and F′(x) is non-zero. It keeps
T the same, reduces M, and replaces F(x) by the complementary function G(y) satisfying 


F'(G'(y)) = y. 


The iteration continues until M is small enough for the sum to be estimated trivially, 
or until the new function that replaces F(x) becomes non-standard. The B step must be 
followed by an A step, but A steps can be iterated, so the iteration has a branched tree 
structure. Each A step introduces an extra variable, summed over a short range. A more 
complicated version of the iteration allows A and B steps with respect to these subsidiary 
variables, or in several variables at once. In practice the iteration is stopped after a few 
steps, either because error terms from a B step in one variable become large when summed
over the other variables, or because large parts of the transformed sums are non-standard.
Van der Corput’s iteration can now be easily studied in Graham and Kolesnik (1991). 

For exponential sums with (log M)/(log T) very small, Vinogradov’s mean value method 
using high moments gives a better estimate than repeated A steps. Vinogradov’s mean value 
method is a single step that does not lead on to further iteration. In some ways it is analogous 
to taking r-th differences. 

Huxley (1989) discovered an analogous iteration that gives upper bounds for Problem B, 
the number of integer points close to a curve. 


Step A Differencing using the interpolation polynomial over short intervals. This step
keeps M the same, reduces N, increases δ, and replaces F(x) by $F^{(r)}(x)$ (to a first
approximation), where r is the order of differencing.


Step B Interchange x and y. This step is used if M > N and F′(x) is non-zero. It
interchanges M and N, reducing M and increasing δ. The function F(x) is replaced by its
inverse function.


The iteration ends with a trivial estimate, which is O(M) if δ < 1/2, O(δM) if the
previous step has made δ > 1/2. The use of r-th differences makes this iteration more
powerful than the van der Corput iteration for exponential sums. Recent work (Filaseta 
and Trifonov 1996, Huxley and Sargos 1995) uses essentially an A step followed by a B 
step. The B step is elaborated by focussing on short arcs of the curve. Sargos introduced 
the classification into major and minor arcs. A major arc is a region where there is a good 
rational approximation y = g(x) (with small denominator) to the equation of the curve, 
such that any integer point (m,n) close to this arc of the curve must satisfy n = g(m). 
Other regions of the curve are called minor arcs. Huxley and Sargos (1995) gave a simple 
upper bound in Problem B. 


] 
2 ae OOF 
k= 0 (amin somnas (5) 1), (4) 


provided that F(x) is non-zero and 
A=T/M' <1. 


The first two terms on the right of (4) are the estimate from the minor arcs. The third 
term on the right of (4) is the possible contribution of a single major arc, and the term 1 
covers trivial cases. An object of current research (Filaseta and Trifonov 1996, Huxley 
and Sargos 2000) is to reduce the first term in (4) under further conditions, such as the 
non-vanishing of F’~ (x) as well. 

Lower bounds can be given in Problem B if δ is not too small (Huxley 1996b). The
number of integer points found in this way is less than the expected number 2δM, and they
all lie on major arcs. When (log M)/(log N) is large, the rational approximations y = g(x) 
on the major arcs are constructed by an iterative process involving differencing. At present 
the denominator of approximation has to be a power of two for the iteration to proceed
for more than two steps. This restriction permits an approximation argument in the 2-adic
metric, and avoids looking in a short interval for a solution of a congruence to an arbitrary 
modulus q. The estimate that starts the iteration is


; 64M? d 
R>c min{ 6M, —— ]} when 6 > ——, (5) 
N /M 


when N > M and F''(x) is non-zero and numerically less than 1. Here c and d are explicit
positive constants constructed from the range of values of F''(x). The iteration leads to
similar but weaker lower bounds when the second derivative F''(x) is replaced by $F^{(r)}(x)$
in the condition.

The van der Corput iteration for exponential sums shows that, for large δ, R is not only
non-zero, but approximately equal to 2δM. The lower bounds are still positive for δ a little
smaller than this range.

For both problems B and C there is a deeper method in the middle range, where (log M)/
(log N) is near 1 for points close to a curve, or (log M)/(log T) is near 1/2 for exponential
sums. For points close to a curve, Swinnerton-Dyer's method uses a fourth moment short
interval estimate on the minor arcs (Swinnerton-Dyer (1974) considered points on the curve,
with M = N and δ = 0, so he had no major arcs). The bounds take the form

$$R = O\left((MN)^{3/10} (\log MN)^{1/10} + \text{terms in } \delta\right) \tag{6}$$

for F''(x) and $F^{(3)}(x)$ non-zero, and N > M. For the latest version, see Huxley (1999).
Similar arguments with higher moments come up against determinants which do not factor 
completely into linear factors. 

For exponential sums the deeper method is that of Bombieri and Iwaniec (1986), which
uses high moments of short exponential sums. Each short sum is labelled by a rational
approximation a/q to ½f''(x), and is transformed by Poisson summation (incorporating
the finite Fourier transform modulo q). On major arcs q is small, and the transformed sum
can be estimated at once. The transformed minor arc sums are estimated in mean 2k-th
power (k = 4 in Bombieri and Iwaniec (1986) and Huxley and Watt (1988), k = 5 in Watt
(1989) and subsequent papers). After some simplification, the large sieve inequality is used


to bound a bilinear form 
DD MHBjex - yy), 
hj 


where h is the integer variable introduced by Poisson summation, and j indexes the rationals
$a_j/q_j$ that label the short sums on the minor arcs. The vectors $x^{(h)}$ lie in a box A in four-
dimensional space, and the vectors $y^{(j)}$ lie in a box B. The upper bound for the bilinear
form requires an estimate for the sum


dD, laneui 
h i 


taken over a neighbourhood of the diagonal in A × A. The size of the neighbourhood is
determined by the size of the box B. It is sufficient to count the number of pairs of vectors
$x^{(h)}$, $x^{(i)}$ in the given neighbourhood. This is the First Spacing Problem; the Second Spacing
Problem is the analogue for the vectors $y^{(j)}$.
The First Spacing Problem involves counting the number of sets of integers $h_1, \ldots, h_{2k}$
with

$$h_1^i + \cdots + h_k^i \approx h_{k+1}^i + \cdots + h_{2k}^i$$

simultaneously for i = 2, 3/2, 1, 1/2. Vinogradov's mean value method produces
conditions like this, but with integer exponents and exact equality. Progress with the First
Spacing Problem was made in Watt (1989) and in Huxley and Kolesnik (1991).

In the Second Spacing Problem, a pair of vectors $y^{(j)}$, $y^{(k)}$ in a neighbourhood of the
diagonal in B × B corresponds to a relation between two minor arcs which we call resonance.
The four coordinates give four coincidence conditions, whose strengths are measured by
real numbers Δ₁, Δ₂, Δ₃, Δ₄, each less than unity. The first coincidence condition implies
the existence of the 'magic matrix' M, which has small integer entries and determinant one,


qk qj 


The factor Δ₁ has to be saved in the construction in order to get even a trivial estimate.

When the magic matrix is fixed, then the second coincidence condition holds on a block
of consecutive minor arcs, the domain of the magic matrix. The factor Δ₂ saved leads to the
non-trivial results of Bombieri and Iwaniec (1986), Huxley and Watt (1988), Watt (1989)
and Huxley and Kolesnik (1991).

The next step is to assume the first and second coincidence conditions (the first, alas, with
borrowed strength which must be repaid). Among the minor arcs in the domain of the magic
matrix, those on which the third and fourth coincidence conditions also hold give rise
to integer points close to a certain curve, the resonance curve. More precisely, a resonance
curve corresponds to a block of U consecutive minor arcs, with U = O(1/Aq). The length 
of the resonance curve grows like U*/*. The result of Huxley (1993) was obtained by 
choosing U so that the resonance curve had bounded length, and thus a bounded number 
of integer points close to it. This gives a saving of a factor Ay * from the third and fourth 
coincidence conditions together. 

Work in progress (Huxley 2001a, 2001b) uses a bound of the form (6). There is a possible
saving by a factor A a A‘! > However the typical resonance curve has a cusp, with the 
gradient on one branch tending to infinity, and on the other branch tending to zero. At the 
cusp and the ends of the resonance curve, the conditions for (6) do not hold, and a weaker 
argument must be found. 

Further improvements would come if different resonance curves could be compared, even 
by showing that most resonance curves do not have an integer point very close to the cusp. 

The usual test of exponential sum bounds is the size of the Riemann zeta function

$$\zeta\left(\tfrac{1}{2} + iT\right) = O(T^{\theta} (\log T)^c).$$

Hardy and Littlewood (1921) had θ = 1/6. Rankin (1955) and Graham and Kolesnik (1991)
calculated the limit of van der Corput's iteration in one variable as θ = 0.164.... Using
the subsidiary variables in van der Corput's iteration gives exponents which are smaller,
but which have θ > 0.1618... (Graham and Kolesnik 1991). Bombieri and Iwaniec
(1986) obtained θ = 0.1607..., and the latest form of their method (Huxley 1993a) has
θ = 0.156140.... Huxley (2001a) uses (6) to reach θ = 0.156098.... The best possible
estimates in the First and Second Spacing Problems in the Bombieri-Iwaniec method would
reach θ = 3/20.

The Bombieri-Iwaniec method works best when the ratio (log M)/(log T) is close to 1/2. 
Sargos (1995) has a modified form of the method which works well near 2/5. Kolesnik has 
a different construction of resonance curves suited to the case when all magic matrices are 
upper triangular, which happens when the ratio is below 3/7. Huxley and Kolesnik made 
extensive calculations of an iteration based on the Kolesnik resonance curve. Their results 
are summarised in chapter 19.3 of Huxley (1996a). 

There seems to be no analogue of the Bombieri-Iwaniec method to exponential sums 
in two variables involving a function F(x, y), because simultaneous Diophantine approx- 
imations to three second order partial derivatives with the same denominator cannot be as 
accurate as an approximation to one second derivative for a function of one variable. The 
form of the large sieve inequality allows a small saving where the second variable is used 
as a parameter that modifies the sum. A similar argument allows a coefficient a(m) to be 
inserted in the sums (2), where a(m) has some integer period g < M, and it can be used 
to extend the estimates from the zeta function to the Dirichlet L-functions (Huxley and 
Watt 1997). 

The best treatment for exponential sums with (log M)/(log T) not near 1/2 is to apply
some A steps of van der Corput’s iteration (or a B step followed by some A steps) until the 
new parameters M and T are in the Bombieri-Iwaniec range. The extra variables from the 
A steps give parameters that modify the main sum, so there is a small extra saving. 

The Bombieri-Iwaniec method does work if there is one function, but two integer variables
occurring as hf(m) as in (3) (Iwaniec and Mozzochi 1989) or as f(m + h) − f(m − h) (Heath-
Brown and Huxley 1990). The First Spacing Problem is different in these constructions,
but the Second Spacing Problem is almost the same; the main difference is that f(x) in (3)
corresponds to f′(x) in (2). This leads to an estimate for the generalised circle problem.
Let C be a closed convex plane curve of area A for which the arc length s is three times
continuously differentiable with respect to the tangent angle ψ (the circle is the case s = ψ,
A = π). Let D be a plane domain bounded by a copy of C enlarged by a factor M > 2.
Then the number of integer points (m, n) in D is

$$AM^2 + O(M^{\theta} (\log M)^c)$$

with θ = 46/73 = 0.6301...; the constant of proportionality depends on C but not on the
orientation of the domain D (Huxley 1993b). In the Dirichlet divisor problem, the number
of positive integer solutions of mn ≤ M² is

$$2M^2 \log M + (2\gamma - 1)M^2 + O(M^{\theta} (\log M)^c)$$

with the same θ. The possible improvement in the Second Spacing Problem would give
θ = 0.6298.... The analogous sums with f(m + h) − f(m − h) in place of hf(m)
in (3) lead to short interval mean square bounds for exponential sums (Heath-Brown and
Huxley 1990, Huxley 1994b).
For exponential sums in one variable, the form of the underlying function F(x) does not
matter very much. In two variables the non-standard regions of the double sum depend
on the shape of F(x, y), for instance on whether the second order terms are elliptic or
hyperbolic. The functions F(x) which arise from the famous problems satisfy algebraic
differential equations. For problem B, points close to a curve, it is sometimes possible to
use this. The relation

$$\frac{p_1}{p_2} \approx \sqrt{\frac{q_2}{q_1}}$$

arises in the study of square-free numbers in a short interval. The curve $F(x) = 1/\sqrt{x}$
has a major arc around the point (1,1) with an approximation y = g(x), an explicit Padé
approximant rational function. This idea was introduced by Roth (1951). Filaseta and
Trifonov combined this with a differencing argument to discuss the gaps between square-
free numbers. Let $s_1, \ldots, s_N$ be the square-free numbers in the range 1, ..., M, so that

$$N \sim \frac{6M}{\pi^2}.$$

Then the maximum gap satisfies

$$s_{n+1} - s_n = O(M^{1/5} (\log M)^c)$$


(Filaseta and Trifonov 1992), and Erdős's asymptotic formula

$$\sum_{n=1}^{N-1} (s_{n+1} - s_n)^{\gamma} \sim \beta(\gamma) M,$$

where β(γ) is a constant determined by the exponent γ, holds for 0 ≤ γ < 59/16 = 3.6875
(Huxley 1999). These arguments use the upper bounds in Problem B.
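The quantities in this paragraph are easy to tabulate for small M. The sketch below is a plain sieve and says nothing about the methods of the cited papers; the helper names `squarefree_numbers` and `gap_moment` are invented, and the density 6/π² is the standard asymptotic density of square-free numbers.

```python
import math


def squarefree_numbers(M):
    """Return the square-free integers in 1..M, in increasing order."""
    is_squarefree = [True] * (M + 1)
    d = 2
    while d * d <= M:
        # strike out every multiple of d^2 (composite d are redundant but harmless)
        for multiple in range(d * d, M + 1, d * d):
            is_squarefree[multiple] = False
        d += 1
    return [n for n in range(1, M + 1) if is_squarefree[n]]


def gap_moment(s, gamma):
    """Sum of (s[n+1] - s[n]) ** gamma over consecutive entries of s."""
    return sum((b - a) ** gamma for a, b in zip(s, s[1:]))


M = 10 ** 5
s = squarefree_numbers(M)
print(len(s), 6 * M / math.pi ** 2)           # N against 6M / pi^2
print(max(b - a for a, b in zip(s, s[1:])))   # largest gap up to M
print(gap_moment(s, 2) / M)                   # empirical ratio for gamma = 2
```

The last line gives an empirical value of the ratio that the asymptotic formula predicts to converge to β(2).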

For the case δ = 0 of Problem B, points actually on the curve, Bombieri and Pila (1989)
have upper bounds using intersection theory in algebraic geometry, which are essentially
best possible for algebraic curves. In general the constant of proportionality in the upper
bound is shown to exist by a compactness argument, and it cannot be calculated explicitly
or uniformly, which rules out some applications. For results of the quality of (6), the curve
must be differentiable a large finite number of times. It seems to be possible to extend
Bombieri and Pila's argument to non-zero δ, but the upper bound increases rapidly as δ
moves away from zero.

The lower bounds in Huxley (1996b) imply but do not improve the famous result of
Bambah and Chowla (1947) that the gap between numbers less than N which are sums of
two squares is $O(N^{1/4})$.
Artin's Conjecture for Polynomials Over Finite Fields
Erik Jensen and M. Ram Murty* 


1 Introduction 


A classical conjecture of E. Artin[Ar] predicts that any integer a other than ±1 or a perfect square is
a primitive root (mod p) for infinitely many primes p. This conjecture is still open. In 1967,
Hooley[H] proved the conjecture assuming the (as yet) unresolved generalized Riemann
hypothesis for Dedekind zeta functions of certain number fields.

In 1983, R. Gupta and M.R. Murty[GM] made the first breakthrough by showing the
following: given three prime numbers a, b, c, then at least one of the thirteen numbers

$$\{ac^2,\ a^3b^2,\ a^2b,\ b^3c,\ b^2c,\ a^2c^3,\ ab^3,\ a^3bc^2,\ bc^3,\ a^2b^3c,\ a^3c,\ ab^2c^3,\ abc\}$$


is a primitive root (mod p) for infinitely many primes p. This result has been further refined 
by R. Gupta, M.R. Murty, and V.K. Murty[GMM] to establish that at least one of the 
seven numbers 


{a, b,c, ab, ab*,a*c, ac’} 


is a primitive root (mod p) for infinitely many primes p. Finally, Heath-Brown[HB] used
the Chen-Iwaniec switching and a celebrated 1986 theorem of Bombieri, Friedlander, and
Iwaniec[BFI] to derive the further refinement that at least one of {a, b, c} is a primitive root
for infinitely many primes p. The paper by Murty[M1] contains an overview of Artin's
conjecture and its analogues for elliptic curves. (See also [M2].)

Although any undergraduate student can easily understand Artin's conjecture, to
understand Hooley's results requires a strong background in both analytic and algebraic number
theory, and thus is only possible at the senior or graduate level. The results of Gupta,
M.R. Murty, V.K. Murty, and Heath-Brown require an even more formidable background 
in advanced sieve theory that is available only to the doctoral student or to the expert working 
in the field. 

The purpose of this paper is twofold. Our first goal is to show that the sieve approach can 
be understood, at least conceptually, by the undergraduate student. Indeed, it was this kind 
of conceptual reasoning that led to the original breakthrough. Once our reasoning is in place, 
it is then just a matter of technical expertise to fine tune the argument to get a refined result. 


*The first author, an undergraduate, would like to thank the second author for giving him the opportunity of 
being a part of this research project. Research of the second author was partially supported by NSERC. 
Our second goal is to study the analogue of Artin's conjecture for polynomials mod p.
More precisely, fix a prime p and let a(x) be a non-constant polynomial which is not equal 
to the square of a polynomial mod p. Are there infinitely many irreducible polynomials 
p(x) such that a(x) generates the residue classes of $(\mathbb{F}_p[x]/(p(x)))^*$?

In 1937, Bilharz[B], a student of E. Artin, answered this in the affirmative, by assuming 
the truth of the so-called Riemann hypothesis for curves. Then, in 1948, A. Weil proved 
the Riemann hypothesis for curves as a consequence of his rigorous treatment of algebraic 
geometry (see [L] for a proof). Thus, as it stands, even this “function field analogue” is not 
accessible to the undergraduate student. 

In this paper, we will show that for a(x) = x” +c, Artin’s conjecture for F,[x] can 
be established very easily and almost from first principles. The fact that the junior author 
co-authored this paper is proof enough that it is accessible to the undergraduate. 


2 Sieve Theory and Artin’s Conjecture 


Understanding this section requires a familiarity with quadratic residues. We direct the
reader to chapter 5, section 1 of Ireland and Rosen[IR].

Suppose that we want to prove that 2 is a primitive root (mod p) for infinitely many
primes p. First, observe that if (p − 1) = 4q, with q an odd prime, then 2 is a primitive root
(mod p). To see this, suppose that 2 has order r. Then r | (p − 1). Hence, r | 4q. Therefore,
r ∈ {1, 2, 4, q, 2q, 4q}. Let's examine the separate cases. Clearly the case r = 1 is not
possible. If r = 2, we have that p | (2² − 1), which implies that p = 3. However, p = 4q + 1,
which is clearly ≥ 13. So, we see that r ≠ 2. Similarly, if r = 4 we have that p | (2⁴ − 1),
which implies that p = 3 or p = 5. However, we have seen that p ≥ 13, so it is clear that
r ≠ 4. If r = q or r = 2q, then

$$2^{(p-1)/2} \equiv 1 \pmod{p}.$$

Hence, 2 is a quadratic residue (mod p). Therefore,

$$p \equiv \pm 1 \pmod{8}.$$

However, it is easily seen that

$$p = 4q + 1 = 4(2k + 1) + 1 = 8k + 5 \equiv 5 \pmod{8}.$$

Hence, r = 4q = (p − 1). It is at present unknown whether there are infinitely many
primes p such that (p − 1) = 4q, with q prime. Heuristic reasoning[IR, ch. 2, §4] suggests
that the number of such primes p ≤ x is asymptotic to $cx/\log^2 x$ as x → ∞, for some constant
c > 0. So, our attempt at a proof stalls here. However, if we continue along this line of
thought, and use the ideas of sieve theory, we can prove that one of 2, 3, 5 is a primitive root
for infinitely many primes p. First, though, we will pause to prove a lemma which will be
useful later on.
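The observation about primes of the form p = 4q + 1 can be confirmed by brute force for small p. The sketch below is illustrative only; the helper names are invented, and the order of 2 is found by naive repeated multiplication.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def multiplicative_order(a, p):
    """Order of a modulo the prime p, assuming p does not divide a."""
    r, power = 1, a % p
    while power != 1:
        power = power * a % p
        r += 1
    return r


def primes_4q_plus_1(limit):
    """Primes p <= limit with p - 1 = 4q and q an odd prime."""
    return [p for p in range(13, limit + 1, 4)
            if is_prime(p) and is_prime((p - 1) // 4)]


for p in primes_4q_plus_1(200):
    print(p, multiplicative_order(2, p) == p - 1)   # 2 is a primitive root mod p
```

Starting the range at 13 and stepping by 4 restricts to p ≡ 1 (mod 4) with q ≥ 3, so q = 2 never occurs.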


Lemma: A natural number n cannot have more than (log n)/(log 2) prime factors.
Proof: Let $n = p_1^{a_1} p_2^{a_2} \cdots p_l^{a_l}$, with $p_1, \ldots, p_l$ distinct primes. Then we see that

$$\log n = \sum_{i=1}^{l} a_i \log p_i \ge l \log 2$$

and so clearly we have that

$$l \le \frac{\log n}{\log 2}$$

and the lemma is proved. □
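The bound in the lemma is equivalent to 2^l ≤ n, where l is the number of distinct prime factors of n, which can be tested without floating point. The following is a small illustrative check; `distinct_prime_factors` is a hypothetical helper using trial division.

```python
def distinct_prime_factors(n):
    """Return the distinct prime factors of n >= 1, in increasing order."""
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:   # remove every copy of d
                n //= d
        d += 1
    if n > 1:                   # what remains is a prime factor
        factors.append(n)
    return factors


for n in (30, 210, 2310, 1024):
    l = len(distinct_prime_factors(n))
    print(n, l, 2 ** l <= n)    # l <= log n / log 2 is the same as 2^l <= n
```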


Ramanujan[R] observed that this estimate can be refined to

$$l = O\left(\frac{\log n}{\log\log n}\right)$$

in the following way: first, note that

$$\log n \ge \sum_{i=1}^{l} \log p_i \ge \sum_{i=1}^{l} \log q_i$$

where $2 = q_1 < q_2 < \cdots$ is the sequence of primes. Then, by the elementary Tchebychef
theorem (see [IR, p. 25]), we obtain

$$\log n \ge c\, l \log l$$

for some constant c > 0. If $l \le \frac{\log n}{\log\log n}$ then we are done, so assume that $l > \frac{\log n}{\log\log n}$.
Then,

$$\log l > \log\log n - \log\log\log n$$

which gives

$$\log n \gg l \log\log n$$

as desired.

The seminal idea of Gupta and Murty was to use sieve theory to produce primes p such
that (p − 1) has very few prime factors, thereby restricting the possibility for 2 to have
order < (p − 1) mod p. Indeed, sieve theory provides at least

$$\frac{cx}{\log^2 x}, \qquad c > 0$$

primes p ≤ x such that (p − 1) = 2q or (p − 1) = 2q₁q₂, with q, q₁, q₂ prime and
(2/p) = (3/p) = (5/p) = −1. Moreover, one can arrange


1 
q2 > x2 > @q > x6 for some 6 > 0. 
If (p − 1) = 2q, then as (2/p) = −1, we use the same reasoning as before and easily see
that 2 has order (p − 1) mod p in this case. If (p − 1) = 2q₁q₂, then the order of 2 (mod p)
is either 2q₁, 2q₂, or 2q₁q₂. If the order of 2 (mod p) is 2q₁, then $p \mid (2^{2q_1} - 1)$ and hence


p| [|] @f-b. (1) 


a<2x27 
Using Lemma 1, we see that the number of primes p < x satisfying (1) cannot exceed 
St a cay! 
a<2x2~ 


Therefore, we have at least 


CX »: 
_ Ay 1-28 


log? x 


primes p ≤ x such that (p − 1) = 2q or (p − 1) = 2q₁q₂ where the order of 2 (mod p) is
either 2q₂ or (p − 1).

Among these primes, let us consider the order of 3 (mod p). If the order is 2q₁, then, as
above, we have that 


Pp I] (3° — 1). 
a<2x2~ 


and the number of such primes cannot exceed 
> log 3 pe log 3 goes 
log 2 log 2 
a<2x27- 
Similarly, the number of primes p for which the order of 5 (mod p) is 2q₁ cannot exceed
> log 5 He log5 Ay 1-25, 
log 2 log 2 


fee 
a<2x2 


Therefore, we have at least 
Cx 
— — O( 1-28) 
log” x 
primes p ≤ x such that the order of 2, 3, and 5 (mod p) is one of 2q₂ or (p − 1).
We now show that there are infinitely many primes p such that one of 2, 3, 5 is a primitive
root mod p. If not, then 2, 3, and 5 generate a subgroup of order 2q₂ (mod p). Observe that


pos x 0.74 


2q2 = < — 
q qi 0-26 
Notice that if the numbers in the set

$$\{2^a 3^b 5^c : 0 \le a, b, c \le x^{\alpha}\}, \qquad \alpha = 0.247$$


are all distinct, they generate a subgroup of size x°%. This is clearly a contradiction, since 
3a = 0.741 => x3” > z. Hence, for some a, b, c and a’, b’, c’, we have that 


$$2^a 3^b 5^c \equiv 2^{a'} 3^{b'} 5^{c'} \pmod{p}$$

which implies that

$$2^{a-a'} 3^{b-b'} 5^{c-c'} \equiv 1 \pmod{p}.$$
Thus, p divides the numerator of 
[iy . 83?5ce-1): (2) 


O<|a}, |b], |c|<2x 


Observe that if e;, 1 < i <r are integers then the numerator of 


| 
[or -1 
i=l 


[To - [TL a*. 


ej >0 ej <0 


Thus, for a given triple a, b,c in the product from (2), the numerator can have at most 
O(x*) prime factors. 
The number of such primes is O(x*%), which is Oran) since 4a < |. Hence, one of 


2, 3, or 5 is a primitive root (mod p) for these primes. 


3 Sifting Function for Primitive Roots 


In this section, we will use the Ramanujan sum and Theorem 272 from Hardy and Wright
[HW, p. 238]. The Ramanujan sum is defined as:

$$c_d(j) = \sum_{\substack{1 \le k \le d \\ (k, d) = 1}} e^{2\pi i jk/d}.$$

Theorem 272 states that

$$c_d(j) = \frac{\mu(d/a)\,\phi(d)}{\phi(d/a)}, \qquad \text{where } a = (d, j).$$

With these tools at our disposal, we will prove the following lemma:
Lemma: Let G be a cyclic group of order n. Let

$$f(g) = \frac{\phi(n)}{n} \sum_{d \mid n} \frac{\mu(d)}{\phi(d)} \sum_{\operatorname{ord} \chi = d} \chi(g)$$

where the inner sum runs over characters χ of G which are of order d. Then

$$f(g) = \begin{cases} 1, & \text{if } g \text{ generates } G \\ 0, & \text{otherwise.} \end{cases}$$
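Proof: Let e be a generator of G. Let $g = e^j$. Then g generates G if and only if (j, n) = 1.

Notice that if $\psi(e) = e^{2\pi i/n}$, then $\psi^{n/d}$ is a character of order d. In fact, all of the characters
of order d are given by $\psi^{nk/d}$ where (k, d) = 1 and 1 ≤ k ≤ d. Then

$$f(g) = \frac{\phi(n)}{n} \sum_{d \mid n} \frac{\mu(d)}{\phi(d)} \sum_{\substack{1 \le k \le d \\ (k, d) = 1}} \psi^{nk/d}(g).$$

Notice that

$$\psi^{nk/d}(g) = \psi(e)^{njk/d} = e^{2\pi i jk/d}.$$

We can use this fact to rewrite f(g) more elegantly as

$$f(g) = \frac{\phi(n)}{n} \sum_{d \mid n} \frac{\mu(d)}{\phi(d)}\, c_d(j).$$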
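Now, we apply Theorem 272 to see that

$$f(g) = \frac{\phi(n)}{n} \sum_{d \mid n} \frac{\mu(d)}{\phi(d)} \cdot \frac{\mu\big(\frac{d}{(d,j)}\big)\phi(d)}{\phi\big(\frac{d}{(d,j)}\big)} = \frac{\phi(n)}{n} \sum_{d \mid n} \frac{\mu(d)\,\mu\big(\frac{d}{(d,j)}\big)}{\phi\big(\frac{d}{(d,j)}\big)}.$$

Notice that

$$g = e^j \text{ generates } G \implies (j, n) = 1 \implies (j, d) = 1 \ \ \forall\, d \mid n.$$

So, let g generate G. We now have that

$$f(g) = \frac{\phi(n)}{n} \sum_{d \mid n} \frac{\mu^2(d)}{\phi(d)}.$$

Since both $\mu$ and $\phi$ are multiplicative functions, $\mu^2/\phi$ is also a multiplicative function. So, when g generates G, we have

$$f(g) = \prod_{p \mid n}\Big(1 - \frac{1}{p}\Big) \prod_{p \mid n}\Big(1 + \frac{1}{p-1}\Big) = \prod_{p \mid n} \frac{p-1}{p} \prod_{p \mid n} \frac{p}{p-1} = 1.$$

So, when g generates G, f(g) = 1.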
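Now, let us consider the case when g does not generate G. So, let $(j, n) = \delta$, $\delta > 1$. Then $n = \delta t$. Observe also that (j, t) = 1 and $(\delta, t) = 1$, since $\delta \mid j$. We look at

$$\frac{\phi(n)}{n} \sum_{d \mid n} \frac{\mu(d)\,\mu\big(\frac{d}{(d,j)}\big)}{\phi\big(\frac{d}{(d,j)}\big)}.$$

Since $(\delta, t) = 1$ and $d \mid n$, we have that $d = (d, \delta)(d, t)$. Let $(d, \delta) = d_1$ and $(d, t) = d_2$, so that $d = d_1 d_2$. Now, we can rewrite the above expression as

$$\frac{\phi(n)}{n} \sum_{\substack{d_1 \mid \delta \\ d_2 \mid t}} \frac{\mu(d_1 d_2)\,\mu\big(\frac{d_1 d_2}{(d_1 d_2, j)}\big)}{\phi\big(\frac{d_1 d_2}{(d_1 d_2, j)}\big)}.$$

Observe that since (j, t) = 1 and $(d, t) = d_2$, then $(d_2, j) = 1$, which implies that $(d_1 d_2, j) = (d_1, j)$. So, we can again rewrite the expression for f(g) as

$$\frac{\phi(n)}{n} \sum_{\substack{d_1 \mid \delta \\ d_2 \mid t}} \frac{\mu(d_1 d_2)\,\mu\big(\frac{d_1 d_2}{(d_1, j)}\big)}{\phi\big(\frac{d_1 d_2}{(d_1, j)}\big)}.$$

Using the facts that $\mu$ and $\phi$ are multiplicative, and that $(d_1, d_2) = 1$, we obtain

$$f(g) = \frac{\phi(n)}{n} \sum_{\substack{d_1 \mid \delta \\ d_2 \mid t}} \frac{\mu(d_1)\mu(d_2)\,\mu\big(\frac{d_1}{(d_1, j)}\big)\mu(d_2)}{\phi\big(\frac{d_1}{(d_1, j)}\big)\phi(d_2)} = \frac{\phi(n)}{n} \sum_{d_1 \mid \delta} \frac{\mu(d_1)\,\mu\big(\frac{d_1}{(d_1, j)}\big)}{\phi\big(\frac{d_1}{(d_1, j)}\big)} \sum_{d_2 \mid t} \frac{\mu^2(d_2)}{\phi(d_2)}.$$

But since $(j, n) = \delta$ and $(d, \delta) = d_1$, then $(d_1, j) = d_1$. So we have

$$f(g) = \frac{\phi(n)}{n} \Big(\frac{t}{\phi(t)}\Big) \sum_{d_1 \mid \delta} \mu(d_1) = 0, \quad \text{since } \delta > 1. \quad \text{(See [IR, p. 19])}$$

So, we have shown that

$$f(g) = \begin{cases} 1, & \text{if } g \text{ generates } G \\ 0, & \text{otherwise.} \end{cases}$$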
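
The lemma can be checked numerically for small cyclic groups. The sketch below is an illustration under stated assumptions, not the authors' method: it models G as the additive group Z/nZ with generator 1 (so the element g corresponds to an index j), evaluates the inner character sum as the Ramanujan sum $c_d(j)$, and computes f directly from its definition. All function names are hypothetical.

```python
import cmath
from math import gcd


def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def euler_phi(n):
    """Euler's totient function by direct counting."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def mobius(n):
    """Moebius function via trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def ramanujan_sum(d, idx):
    """c_d(idx): sum of exp(2*pi*i*idx*k/d) over 1 <= k <= d with gcd(k, d) = 1."""
    return sum(cmath.exp(2 * cmath.pi * 1j * idx * k / d)
               for k in range(1, d + 1) if gcd(k, d) == 1)


def sifting_function(n, idx):
    """f(g) for g = idx in Z/nZ, using the Ramanujan-sum form of the lemma."""
    total = sum(mobius(d) / euler_phi(d) * ramanujan_sum(d, idx)
                for d in divisors(n))
    return euler_phi(n) / n * total
```

For each n, `sifting_function(n, idx)` should be numerically close to 1 when gcd(idx, n) = 1 and close to 0 otherwise, up to floating-point error.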
4 Reformulation and Solution of the Problem 


Let $\mathbb{F}_q$ be a finite field with $q = p^n$ elements. Consider the polynomial ring $\mathbb{F}_p[x]$. Let a(x) be a polynomial in $\mathbb{F}_p[x]$. We would like to know the number of irreducible polynomials $p(x) \in \mathbb{F}_p[x]$ such that a(x) generates $(\mathbb{F}_p[x]/(p(x)))^*$. Recall that if deg p(x) = n then
$$\mathbb{F}_p[x]/(p(x)) \cong \mathbb{F}_{p^n}.$$

Moreover, $\mathbb{F}_{p^n}^*$ is cyclic of order $p^n - 1$. Also recall that the isomorphism $\mathbb{F}_p[x]/(p(x)) \cong \mathbb{F}_{p^n}$ is given explicitly as follows: for $g(x) \in \mathbb{F}_p[x]$, we write

$$g(x) = p(x)q(x) + r(x), \quad \text{with } r(x) = 0 \text{ or } 0 \le \deg r < \deg p = n.$$
Let $\theta \in \mathbb{F}_{p^n}$ be a root of p(x). Then

$$g(\theta) = r(\theta) = a_0 + a_1\theta + \cdots + a_{n-1}\theta^{n-1}, \quad a_i \in \mathbb{F}_p,$$

describes all the elements of $\mathbb{F}_{p^n}$. Thus, a(x) generating $(\mathbb{F}_p[x]/(p(x)))^*$ is equivalent to $a(\theta)$ generating $\mathbb{F}_{p^n}^*$.

Hence, to count the number of irreducible p(x) of degree n for which a(x) is a generator is tantamount to counting the number of $\theta$ of degree n for which $a(\theta)$ generates $\mathbb{F}_{p^n}^*$. Indeed, since each p(x) has n roots, we find

$$\#\{p(x) \in \mathbb{F}_p[x] : p(x) \text{ irreducible}, \deg p = n, a(x) \text{ generates } (\mathbb{F}_p[x]/(p(x)))^*\}$$
$$= \frac{1}{n}\,\#\{\theta \in \mathbb{F}_{p^n} : \deg\theta = n,\ a(\theta) \text{ generates } \mathbb{F}_{p^n}^*\}.$$


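We now apply the lemma of section 3 to see that the number in question is:

$$\frac{1}{n} \sum_{\substack{\theta \in \mathbb{F}_{p^n} \\ \deg\theta = n}} f(a(\theta)) = \frac{1}{n} \sum_{\substack{\theta \in \mathbb{F}_{p^n} \\ \deg\theta = n}} \frac{\phi(p^n - 1)}{p^n - 1}\Big\{1 + \sum_{\substack{d \mid p^n - 1 \\ d > 1}} \frac{\mu(d)}{\phi(d)} \sum_{\operatorname{ord}\chi = d} \chi(a(\theta))\Big\}$$

$$= \frac{1}{n}\,\frac{\phi(p^n - 1)}{p^n - 1} \sum_{\substack{\theta \in \mathbb{F}_{p^n} \\ \deg\theta = n}} 1 \;+\; \frac{1}{n}\,\frac{\phi(p^n - 1)}{p^n - 1} \sum_{\substack{d \mid p^n - 1 \\ d > 1}} \frac{\mu(d)}{\phi(d)} \sum_{\operatorname{ord}\chi = d}\ \sum_{\substack{\theta \in \mathbb{F}_{p^n} \\ \deg\theta = n}} \chi(a(\theta)).$$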

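So, the contribution from the main term (the first term in the expression above) is
$$\frac{1}{n}\,\frac{\phi(p^n - 1)}{p^n - 1}\,\#\{\theta \in \mathbb{F}_{p^n} : \deg\theta = n\}$$
and the error term is
$$\frac{\phi(p^n - 1)}{n(p^n - 1)} \sum_{\substack{d \mid p^n - 1 \\ d > 1}} \frac{\mu(d)}{\phi(d)} \sum_{\operatorname{ord}\chi = d}\ \sum_{\substack{\theta \in \mathbb{F}_{p^n} \\ \deg\theta = n}} \chi(a(\theta)).$$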
We will show in the following sections that when d > 1, and a(x) is of the form $x^m + c$, the sum

$$\Big|\sum_{\substack{\theta \in \mathbb{F}_{p^n} \\ \deg\theta = n}} \chi(a(\theta))\Big| \le m p^{n/2}. \tag{3}$$

Hence, the contribution from the error term is $O(p^{n/2}\,\delta(p^n - 1))$, where $\delta(u)$ is the number of divisors of u.

The estimate in (3) is a consequence of A. Weil’s celebrated work proving the analogue of 
the Riemann hypothesis for zeta functions over finite fields. However, one can also obtain 
this estimate in a more elementary manner using Gauss sums. In section 5, we recount 
some of the theory of Gauss sums over finite fields. Then, in section 6, we use that theory 
to obtain the estimate in (3). 

Once we justify this estimate, it is clear that we have proven the function field analogue of Artin's conjecture in the special case where $a(x) = x^m + c$. To see this, notice that in our expression for the number of irreducible p(x) of degree n such that a(x) generates $(\mathbb{F}_p[x]/(p(x)))^*$, the contribution from the main term far outweighs the contribution from the error term.


5 Gauss Sums Over Finite Fields 


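In this section, we review some basic properties of Gauss sums over finite fields. All of this can be found in Davenport [D].
Let p be prime and consider the finite field $\mathbb{F}_{p^n}$. Let

$$e(\alpha) = \exp\Big(\frac{2\pi i}{p}\operatorname{Tr}_{\mathbb{F}_{p^n}/\mathbb{F}_p}(\alpha)\Big), \quad \text{for } \alpha \in \mathbb{F}_{p^n}.$$
Let $\chi$ be a character of $\mathbb{F}_{p^n}^*$. The Gauss sum is defined as

$$\tau(\chi) = \sum_{\alpha \in \mathbb{F}_{p^n}^*} \chi(\alpha)e(\alpha).$$

Observe that for any $\eta \ne 0$,

$$\chi(\eta)\tau(\bar\chi) = \sum_{\alpha \in \mathbb{F}_{p^n}^*} \bar\chi(\alpha)\chi(\eta)e(\alpha).$$

If we make the change of variables $\alpha = \eta\beta$, we have that

$$\chi(\eta)\tau(\bar\chi) = \sum_{\beta \in \mathbb{F}_{p^n}^*} \bar\chi(\beta)e(\eta\beta),$$

since $\bar\chi(\eta)\chi(\eta) = |\chi(\eta)|^2 = 1$ and as $\beta$ ranges over elements of $\mathbb{F}_{p^n}^*$, so does $\eta\beta$.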
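Theorem: $|\tau(\chi)| = p^{n/2}$, for all non-trivial characters $\chi$.
Proof: If $\chi \ne \chi_0$ (the principal character), then
$$|\tau(\chi)|^2 = \sum_{\alpha, \beta \in \mathbb{F}_{p^n}^*} \chi(\alpha)\bar\chi(\beta)e(\alpha - \beta).$$
We set $\alpha = \beta\gamma$. Then we have that
$$|\tau(\chi)|^2 = \sum_{\gamma \in \mathbb{F}_{p^n}^*} \chi(\gamma) \sum_{\beta \in \mathbb{F}_{p^n}^*} e(\beta(\gamma - 1)).$$
Notice that if $\omega_1, \ldots, \omega_n$ is an $\mathbb{F}_p$-basis of $\mathbb{F}_{p^n}$, then
$$\sum_{\beta \in \mathbb{F}_{p^n}} e(\beta) = \sum_{a_1, \ldots, a_n \in \mathbb{F}_p} e(a_1\omega_1 + \cdots + a_n\omega_n) = \prod_{j=1}^{n}\Big(\sum_{a_j \in \mathbb{F}_p} e(a_j\omega_j)\Big).$$
Since
$$\sum_{a_j \in \mathbb{F}_p} e(a_j\omega_j) = \sum_{a_j = 0}^{p-1} \exp\Big(\frac{2\pi i a_j \operatorname{Tr}_{\mathbb{F}_{p^n}/\mathbb{F}_p}(\omega_j)}{p}\Big) = \begin{cases} p, & \text{if } \operatorname{Tr}_{\mathbb{F}_{p^n}/\mathbb{F}_p}(\omega_j) = 0 \\ 0, & \text{otherwise,} \end{cases}$$
we have that
$$\sum_{\beta \in \mathbb{F}_{p^n}} e(\beta) = \begin{cases} p^n, & \text{if } \operatorname{Tr}_{\mathbb{F}_{p^n}/\mathbb{F}_p}(\omega_1) = \cdots = \operatorname{Tr}_{\mathbb{F}_{p^n}/\mathbb{F}_p}(\omega_n) = 0 \\ 0, & \text{otherwise.} \end{cases}$$
In the first case, we deduce that
$$\operatorname{Tr}_{\mathbb{F}_{p^n}/\mathbb{F}_p}(\eta) = 0 \quad \forall\, \eta \in \mathbb{F}_{p^n}.$$
In particular, if $1, \theta, \theta^2, \ldots, \theta^{n-1}$ is a basis, we find that
$$0 = a_{ij} = \operatorname{Tr}_{\mathbb{F}_{p^n}/\mathbb{F}_p}(\theta^{i+j}) = \sum_{k=1}^{n} \big(\theta^{(k)}\big)^{i}\big(\theta^{(k)}\big)^{j},$$
where $\theta^{(1)}, \ldots, \theta^{(n)}$ are the conjugates of $\theta$. The above equation can be rewritten as $\Omega\Omega^{T} = 0$, where $\Omega = \big((\theta^{(j)})^{i}\big)$ is a Vandermonde matrix. However, this is a contradiction since $\theta^{(1)}, \ldots, \theta^{(n)}$ are all distinct, which implies that $\det\Omega \ne 0$.
So, we have that

$$\sum_{\beta \in \mathbb{F}_{p^n}} e(\beta) = 0.$$

Now, if $\gamma \ne 1$, then
$$\sum_{\beta \in \mathbb{F}_{p^n}^*} e(\beta(\gamma - 1)) = -1.$$
If $\gamma = 1$, then
$$\sum_{\beta \in \mathbb{F}_{p^n}^*} e(\beta(\gamma - 1)) = p^n - 1.$$

Thus we have that

$$|\tau(\chi)|^2 = p^n - 1 + (-1)\sum_{\substack{\gamma \ne 1 \\ \gamma \in \mathbb{F}_{p^n}^*}} \chi(\gamma) = (p^n - 1) + (-1)(-1) = p^n,$$
since $\chi$ is non-trivial.
So we have shown that $|\tau(\chi)| = p^{n/2}$, for all non-trivial characters $\chi$. □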
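
The theorem is easy to test numerically in the simplest case n = 1, where the trace is the identity and $e(\alpha) = \exp(2\pi i\alpha/p)$. The following sketch is only an illustration under that assumption; the helper names are hypothetical, and characters of $\mathbb{F}_p^*$ are indexed by an integer k via a brute-force primitive root g, with $\chi_k(g^a) = \exp(2\pi i ka/(p-1))$.

```python
import cmath
from math import sqrt


def find_primitive_root(p):
    """Smallest primitive root of an odd prime p, found by brute force."""
    for g in range(2, p):
        x, seen = 1, set()
        for _ in range(p - 1):
            x = x * g % p
            seen.add(x)
        if len(seen) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def make_character(p, k):
    """Return chi_k on F_p, with chi_k(0) = 0 and chi_k(g^a) = exp(2*pi*i*k*a/(p-1))."""
    g = find_primitive_root(p)
    log = {pow(g, a, p): a for a in range(p - 1)}  # discrete logarithm table

    def chi(x):
        x %= p
        if x == 0:
            return 0
        return cmath.exp(2 * cmath.pi * 1j * k * log[x] / (p - 1))

    return chi


def gauss_sum(p, k):
    """tau(chi_k) = sum over x in F_p^* of chi_k(x) * exp(2*pi*i*x/p)."""
    chi = make_character(p, k)
    return sum(chi(x) * cmath.exp(2 * cmath.pi * 1j * x / p)
               for x in range(1, p))
```

For every k with 1 <= k <= p - 2 the value `abs(gauss_sum(p, k))` should agree with `sqrt(p)` up to rounding, while the trivial character (k = 0) gives the sum -1.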


6 Estimation 


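We will estimate

$$\sum_{\theta \in \mathbb{F}_{p^n}} \chi(a(\theta))$$

when $a(x) = x^m + c$. For this purpose, we can rewrite the above as

$$\sum_{\theta \in \mathbb{F}_{p^n}} \chi(\theta^m + c).$$

If $(m, p^n - 1) = 1$, then $\theta \mapsto \theta^m$ is an isomorphism, and the sum becomes

$$\sum_{\theta \in \mathbb{F}_{p^n}} \chi(\theta + c) = 0.$$

If $(m, p^n - 1) = t > 1$, then write m = ts. Then, $(s, p^n - 1) = 1$, and by the same reasoning,

$$\sum_{\theta \in \mathbb{F}_{p^n}} \chi(\theta^m + c) = \sum_{\theta \in \mathbb{F}_{p^n}} \chi(\theta^t + c). \tag{4}$$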
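Since $t \mid (p^n - 1)$, let $\psi$ be a character of $\mathbb{F}_{p^n}^*$ of order t. Then

$$\frac{1}{t}\sum_{i=0}^{t-1} \psi^i(\alpha) = \begin{cases} 1, & \text{if } \alpha = \theta^t \text{ for some } \theta \\ 0, & \text{otherwise.} \end{cases}$$

Putting this in expression (4) gives

$$\sum_{\theta \in \mathbb{F}_{p^n}} \chi(\theta^t + c) = \sum_{\alpha \in \mathbb{F}_{p^n}} \chi(\alpha + c) \sum_{i=0}^{t-1} \psi^i(\alpha) = \sum_{i=0}^{t-1} \sum_{\alpha \in \mathbb{F}_{p^n}} \chi(\alpha + c)\psi^i(\alpha).$$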


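Now, we use Gauss sums to replace $\chi(\alpha + c)$ as follows:

$$\chi(\alpha + c) = \frac{1}{\tau(\bar\chi)} \sum_{\beta \in \mathbb{F}_{p^n}} \bar\chi(\beta)e(\beta(\alpha + c)).$$

Thus, interpreting $\chi(0)$ as zero, we have
$$\sum_{\alpha \in \mathbb{F}_{p^n}} \psi^i(\alpha)\chi(\alpha + c) = \frac{1}{\tau(\bar\chi)} \sum_{\alpha \in \mathbb{F}_{p^n}} \psi^i(\alpha) \sum_{\beta \in \mathbb{F}_{p^n}} \bar\chi(\beta)e(\beta(\alpha + c))$$

$$= \frac{1}{\tau(\bar\chi)} \sum_{\beta \in \mathbb{F}_{p^n}} \bar\chi(\beta)e(\beta c) \underbrace{\sum_{\alpha \in \mathbb{F}_{p^n}} \psi^i(\alpha)e(\beta\alpha)}_{\text{also a Gauss sum}}$$

$$= \frac{1}{\tau(\bar\chi)} \sum_{\beta \in \mathbb{F}_{p^n}} \bar\chi(\beta)e(\beta c)\,\bar\psi^i(\beta)\,\tau(\psi^i)$$

$$= \frac{\tau(\psi^i)}{\tau(\bar\chi)} \sum_{\beta \in \mathbb{F}_{p^n}} \bar\chi(\beta)\bar\psi^i(\beta)e(\beta c).$$

Observe that
$$\tau(\bar\chi)\chi(\eta) = \sum_{\alpha \in \mathbb{F}_{p^n}} \bar\chi(\alpha)e(\alpha\eta).$$
So we have that

$$\sum_{\alpha \in \mathbb{F}_{p^n}} \psi^i(\alpha)\chi(\alpha + c) = \frac{\tau(\psi^i)}{\tau(\bar\chi)}\,\tau(\bar\chi\bar\psi^i)\,(\chi\psi^i)(c).$$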
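Now, since $\chi$, $\bar\chi$, $\psi^i$, $\bar\psi^i$ are non-trivial characters, we apply the theorem of section 5 to see that

$$\Big|\sum_{\alpha \in \mathbb{F}_{p^n}} \psi^i(\alpha)\chi(\alpha + c)\Big| = p^{n/2}.$$

Now, we conclude by showing that when $(m, p^n - 1) = t > 1$,

$$\Big|\sum_{\theta \in \mathbb{F}_{p^n}} \chi(\theta^m + c)\Big| = \Big|\sum_{\theta \in \mathbb{F}_{p^n}} \chi(\theta^t + c)\Big| = \Big|\sum_{i=0}^{t-1} \sum_{\alpha \in \mathbb{F}_{p^n}} \psi^i(\alpha)\chi(\alpha + c)\Big|$$

$$\le \sum_{i=0}^{t-1} \Big|\sum_{\alpha \in \mathbb{F}_{p^n}} \psi^i(\alpha)\chi(\alpha + c)\Big| \le \sum_{i=0}^{t-1} p^{n/2} \le m p^{n/2}.$$

Thus, for all m and c, we have that

$$\Big|\sum_{\theta \in \mathbb{F}_{p^n}} \chi(\theta^m + c)\Big| \le m p^{n/2}.$$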
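
The bound can be checked by brute force in the prime-field case n = 1. The sketch below is an illustration only, under two assumptions that are not in the text: it restricts to $c \ne 0$ (for c = 0 the sum reduces to $\sum_\theta \chi^m(\theta)$, which equals p - 1 when $\chi^m$ is trivial), and it uses hypothetical helper names with characters indexed through a brute-force primitive root.

```python
import cmath
from math import sqrt


def find_primitive_root(p):
    """Smallest primitive root of an odd prime p, found by brute force."""
    for g in range(2, p):
        x, seen = 1, set()
        for _ in range(p - 1):
            x = x * g % p
            seen.add(x)
        if len(seen) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def make_character(p, k):
    """chi_k on F_p with chi_k(0) = 0 and chi_k(g^a) = exp(2*pi*i*k*a/(p-1))."""
    g = find_primitive_root(p)
    log = {pow(g, a, p): a for a in range(p - 1)}

    def chi(x):
        x %= p
        if x == 0:
            return 0
        return cmath.exp(2 * cmath.pi * 1j * k * log[x] / (p - 1))

    return chi


def character_sum(p, k, m, c):
    """Sum of chi_k(theta^m + c) over all theta in F_p."""
    chi = make_character(p, k)
    return sum(chi(pow(theta, m, p) + c) for theta in range(p))


def bound_holds(p, k, m, c, tol=1e-9):
    """Check |sum chi(theta^m + c)| <= m * sqrt(p) for one choice of parameters."""
    return abs(character_sum(p, k, m, c)) <= m * sqrt(p) + tol
```

For example, `bound_holds(13, 1, 2, 1)` tests the quadratic case $a(x) = x^2 + 1$ over $\mathbb{F}_{13}$ with a character of order 12.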
Continuous Homomorphisms as Arithmetical Functions,
and Sets of Uniqueness


I. Kátai


This is a survey paper on the characterization of continuous group homomorphisms as arithmetical functions, 
and on sets of uniqueness with respect to completely additive functions. 


1 Introduction 


Let, as usual, $\mathbb{N}, \mathbb{Z}, \mathbb{Q}, \mathbb{R}, \mathbb{C}$ be the sets of positive integers, integers, rational, real, and complex numbers, respectively. Let $\mathbb{Q}_x$, $\mathbb{R}_x$ be the multiplicative groups of positive rationals and positive reals, respectively. Let $\mathcal{P}$ be the set of prime numbers.

For an arbitrary, additively written Abelian group G let $\mathcal{A}_G$, resp. $\mathcal{A}_G^*$, denote the classes of additive, resp. completely additive functions. A function $f : \mathbb{N} \to G$ belongs to $\mathcal{A}_G$ if f(nm) = f(m) + f(n) holds for each pair of coprime m, n, and it belongs to $\mathcal{A}_G^*$ if the above equation holds for all pairs $m, n \in \mathbb{N}$. If G is written multiplicatively, then we shall write $\mathcal{M}_G$, $\mathcal{M}_G^*$ instead of $\mathcal{A}_G$, $\mathcal{A}_G^*$, and the corresponding functions are called multiplicative, completely multiplicative.

If $G = \mathbb{R}$, then we shall write simply $\mathcal{A}$, $\mathcal{A}^*$ instead of $\mathcal{A}_{\mathbb{R}}$, $\mathcal{A}_{\mathbb{R}}^*$.

If $f \in \mathcal{A}^*$, then its domain $\mathbb{N}$ can be extended to $\mathbb{Q}_x$ by

$$f\Big(\frac{m}{n}\Big) := f(m) - f(n),$$
and the functional equation

$$f(r_1 r_2) = f(r_1) + f(r_2)$$

remains valid for every $r_1, r_2 \in \mathbb{Q}_x$.
Let us assume that G is a topological group and $f : \mathbb{Q}_x \to G$ is continuous at 1. Then for each $\alpha \in \mathbb{R}_x$ there exists the limit

$$\lim_{r \to \alpha} f(r) =: \Phi(\alpha),$$

$\Phi$ is continuous everywhere in $\mathbb{R}_x$, and furthermore $\Phi(\alpha\beta) = \Phi(\alpha) + \Phi(\beta)$ is valid for all $\alpha, \beta \in \mathbb{R}_x$, i.e. $\Phi$ is a continuous homomorphism of $\mathbb{R}_x$ into G.

On the other hand, if $\Phi : \mathbb{R}_x \to G$ is a homomorphism, then its restriction to the domain $\mathbb{N}$ is a completely additive function.
Let S be an R-module, containing at least two elements, defined over an integral domain R which has an identity. Consider the set of all doubly infinite sequences $(\ldots, s_{-1}, s_0, s_1, \ldots)$ of elements of S. We introduce the shift operator E whose action takes a typical sequence $\{s_n\}$ to the new sequence $\{s_{n+1}\}$. If

$$P(x) = \sum_{j=0}^{r} c_j x^j$$

is a polynomial with coefficients in R, we extend this definition by defining

$$P(E)s_n = \sum_{j=0}^{r} c_j s_{n+j}.$$

In this way we define a ring of operators which is isomorphic to the ring of polynomials with coefficients in R. Let I be the identity operator, and $\Delta := E - I$.
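
The action of a polynomial in the shift operator can be made concrete with a few lines of code. This is a minimal sketch, assuming sequences are represented as Python functions of an integer index and polynomials as coefficient lists `[c_0, c_1, ..., c_r]`; the name `apply_shift_polynomial` is hypothetical.

```python
def apply_shift_polynomial(coeffs, seq):
    """Return the sequence n -> sum_j coeffs[j] * seq(n + j), i.e. P(E) applied to seq."""
    def result(n):
        return sum(c * seq(n + j) for j, c in enumerate(coeffs))
    return result


def squares(n):
    """Example sequence s_n = n^2."""
    return n * n


# Delta = E - I corresponds to the polynomial x - 1, i.e. coefficients [-1, 1].
delta_squares = apply_shift_polynomial([-1, 1], squares)

# Delta^2 = (E - I)^2 corresponds to x^2 - 2x + 1, i.e. coefficients [1, -2, 1].
delta2_squares = apply_shift_polynomial([1, -2, 1], squares)
```

Here `delta_squares(n)` equals `2*n + 1` and `delta2_squares(n)` equals `2`, matching the usual first and second differences of $n^2$.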

We shall say that an additive function f is of finite support, if it vanishes on the set of 
prime powers except possibly on the powers of finitely many primes. 

For $z \in \mathbb{R}$ let $\|z\| := \min_{k \in \mathbb{Z}} |z - k|$.


2 Characterization of log as an Additive Arithmetical Function 


The function f(n) = log n belongs to $\mathcal{A}^*$. Normally log is considered as a mapping $\mathbb{R}_x \to \mathbb{R}$, and in this context it is well known that continuity along with the functional equation f(xy) = f(x) + f(y) characterizes the logarithm up to a constant factor. Restricting the domain from $\mathbb{R}_x$ to $\mathbb{N}$ creates an interesting question: what further properties, along with (complete) additivity, will ensure that an arithmetic function f is in fact c log n?

The first result of this type was proved by P. Erdős [1] in 1946.


Theorem 1 If $f \in \mathcal{A}$ and $\Delta f(n) \ge 0$ for all n, or $\Delta f(n) \to 0$ $(n \to \infty)$, then f(n) is a constant multiple of log n.


In [2] we proved


Theorem 2 If $f \in \mathcal{A}$ and $\liminf \Delta^k f(n) \ge 0$ with some $k \in \mathbb{N}$, then f is a constant multiple of log n.


Important progress was achieved by E. Wirsing [3], who proved the following conjecture of Erdős.


Theorem 3 If $f \in \mathcal{A}$ and $\Delta f(n) \ge -K$ with some constant K, then f(n) = c log n + u(n), where u(n) is bounded and c is a suitable constant.


Another conjecture of Erdős was proved in [4].


Theorem 4 If $f \in \mathcal{A}$ and
$$\frac{1}{x} \sum_{n \le x} |\Delta f(n)| \to 0, \tag{2.1}$$
then f = c log.
Somewhat later the condition (2.1) was weakened by E. Wirsing. Namely, he proved in [5]


Theorem 5 Let $f \in \mathcal{A}$. Assume that there exists a constant $\gamma > 1$ and a sequence $x_1 < x_2 < \ldots$ such that

$$x_\nu^{-1} \sum_{x_\nu < n \le \gamma x_\nu} |\Delta f(n)| \to 0 \quad (\nu \to \infty).$$

Then f = c log.


By making use of very original new ideas and some deep results on the distribution of primes in arithmetical progressions, E. Wirsing [6] proved


Theorem 6 If $f \in \mathcal{A}^*$ and $\Delta f(n) = o(\log n)$, then f(n) = c log n.


One can show easily that the following generalization of the preceding theorems holds true.


Theorem 7


(1) Let $f, g \in \mathcal{A}$. If
(a) $g(n+1) - f(n) \to 0$, then f = g = c log;
(b) $g(n+1) - f(n)$ is bounded, then f(n) = c log n + u(n), g(n) = c log n + v(n), and u, v are bounded.
(2) Let $f, g \in \mathcal{A}^*$. If $g(n+1) - f(n) = o(\log n)$, then f(n) = g(n) = c log n.


For the method of the proof of Theorem 7 see [7], [8].
In [9] and [10] I asked for a characterization of those additive functions which satisfy

$$f(an + b) - f(An + B) \to C \quad \text{as } n \to \infty \tag{2.2}$$

for some integers a > 0, A > 0, b, B, and real constant C. I considered it with B = 0 and small values of a and b in [9] and [10].

With general a and b but still with B = 0, satisfactory results have been achieved by Mauclaire [11].

Elliott solved this problem completely. Namely, he demonstrated in [12] that if (2.2) holds, then there is a constant F such that

$$f(m) = F \log m$$

holds for all m coprime to $aA\Delta$, where $\Delta = aB - Ab$, assuming $\Delta \ne 0$. Moreover he could give the values of f for those prime powers $p^{\alpha}$ for which $p \mid aA\Delta$.
Another important assertion proved by Elliott is formulated as


Theorem 8 Assume that $aA\Delta \ne 0$. There exist positive constants $c, c_1$ so that


fim) fF) | @2 fi uk 


log m_ logan log m_ logn 
holds uniformly for all integers m and n which satisfy $2 \le m \le n \le e^m$ and are prime to $aA\Delta$. Here


L(x) = max | f(an + b) — f(An-+ B)|. 
n<xe 
The constants $c, c_1$ may depend on a, b, A, B.


The best source for the proof of this theorem and other important results is the excellent book of Elliott [13]. Theorem 8 generalizes a result of Wirsing [6] which reads as follows:
Let B(x) be a positive non-decreasing function so that B(x) < 29/9 B(x). Let f € A 
such that f(2) > Oand f(n+ 1) — f(n) < B(n), for every n € N. Then, there is a suitable 


constant y so that 
m n 
ay B(m) = B(n) 
log m_ logan 


fim) f(n) 


log m_ logn 


uniformly for 2 <m <n <e”™. 
We shall say that a sequence of real numbers $t_n$ $(n \in \mathbb{N})$ is tight if

$$\limsup_{x \to \infty} \frac{1}{x}\,\#\{n \le x,\ |t_n| > K\} =: c(K) \to 0 \quad \text{as } K \to \infty. \tag{2.3}$$

A. Hildebrand [14] proved that $f(n+1) - f(n)$, $f \in \mathcal{A}$, has a limit distribution if and only if there exists a constant c such that h(n) := f(n) - c log n satisfies

$$\sum_{p} \frac{\min(1, h^2(p))}{p} < \infty. \tag{2.4}$$

Though it was not formulated explicitly, the following assertion follows immediately from this argument.


Theorem 9 Let $f \in \mathcal{A}$. Then (2.3) holds for $t_n = f(n+1) - f(n)$ if and only if (2.4) is satisfied.


Later Elliott [15] went on to prove the following more general


Theorem 10 Let a > 0, A > 0, b, B be integers which satisfy $aB \ne Ab$, and $\eta(x)$ a real-valued function defined for $x \ge 2$. Let $f_1, f_2 \in \mathcal{A}$, and $\eta(x)$ be an arbitrary function. Let

$$F_x(z) := \frac{1}{x}\,\#\{n \le x \mid f_1(an + b) - f_2(An + B) - \eta(x) \le z\}.$$
The following three propositions are equivalent.


(1) There is an $\eta(x)$ so that the frequencies $F_x(z)$ converge weakly to a distribution function as $x \to \infty$.
(2) There is an $\eta(x)$ so that

$$\lim_{z \to \infty} \limsup_{x \to \infty}\,(1 - F_x(z) + F_x(-z)) = 0.$$
(3) There are real numbers $c_1, c_2$ such that for $h_j(n) := f_j(n) - c_j \log n$ the conditions

$$\sum_{p \in \mathcal{P}} \frac{\min(1, h_j^2(p))}{p} < \infty$$

hold.


Let $\varphi : [0, \infty) \to [0, \infty)$ be a so-called subadditive function, i.e. monotonically increasing, $\varphi(x) \to \infty$ as $x \to \infty$, and the condition

$$\varphi(x + y) \le c_1(\varphi(x) + \varphi(y)) \quad \text{for } x, y \ge 1 \tag{2.5}$$

holds with a suitable constant $c_1 > 0$.
We are interested in giving necessary and sufficient conditions for an additive f to satisfy

$$\sum_{n \le x} \varphi(|f(n+1) - f(n)|) \ll x \quad (x \to \infty). \tag{2.6}$$

Applying the argument we used in our paper [16], written jointly with Indlekofer, one gets


Theorem 11 Let $\varphi$ be a subadditive function. The relation (2.6) holds for an additive function f if and only if there exists a constant c such that h(n) := f(n) - c log n satisfies (2.4) and

$$\sum_{|h(q^m)| \ge 1} \frac{\varphi(|h(q^m)|)}{q^m} < \infty, \tag{2.7}$$

where $q^m$ runs over the set of prime powers.


Proof: Necessity. Assume that (2.6) holds. Then $\Delta f(n)$ is a tight sequence, and so, by Theorem 9 we obtain the fulfilment of (2.4). Since $\Delta f(n) = \Delta h(n) + o(1)$, therefore $\sum_{n \le x} \varphi(|\Delta h(n)|) \ll x$. Let h(n) be written as the sum of the additive functions $h_1(n)$, $h_2(n)$, where $h_1$ is a strongly additive function defined for primes q such that

$$h_1(q) = \begin{cases} h(q), & \text{if } |h(q)| < 1, \text{ or if } q = 2 \\ 0, & \text{otherwise,} \end{cases}$$

and $h_2(n)$ is defined by $h_2(n) := h(n) - h_1(n)$.

From (2.5) one gets easily that $\varphi(x) \ll x^c$ for $x \ge 1$ with a suitable constant c. Furthermore, from the generalized Turán-Kubilius inequality due to Elliott (see Lemma 1.4 [13]), together with (2.4), we obtain that $\sum_{n \le x} \varphi(|\Delta h_1(n)|) \ll \sum_{n \le x} |\Delta h_1(n)|^c \ll x$; consequently, from the assumptions (2.6), (2.5), and $|\Delta h_2(n)| \le |\Delta h_1(n)| + |\Delta h(n)|$ we obtain that

$$\sum_{n \le x} \varphi(|\Delta h_2(n)|) < c_3 x \tag{2.8}$$


with a suitable constant c3, for all x > 2. From (2.8) we obtain (2.7) readily. Let P be the 
set of those primes g for which |h(q)| > 1. As we know (see (2.4)) DupeP 7 < oo. Let us 
choose an arbitrary Y > 1. For all B, (2 <)2’ < Y consider those integers n = 2P yy odd 
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for which y(n + 1) is square-free and coprime to P. By making use of the Eratosthenian 
sieve we can see that the density of these integers is Ms with a positive constant e which 


may depend only on P. Since h2(2’ y) = h2(2°), and the sequences defined for different 
f are disjoint ones, from (2.8) we get that 


e 
Dd. v(m QWF < cs. (2.9) 
2B<y 
Let now B = {q",q > 2,m>2}U{q Ee P,q #2}. LtQ<y, Sg be the set of those 
integers n = 2Qv for which v is odd, and v(2Qv + 1) is square free and coprime to P. 
By the Eratosthenian sieve we obtain that the asymptotic density of Sg is > e;/Q witha 


positive constant e;. Since h2(2Qv) = h2(2) + h2(Q), and the sets Sg are disjoint, we 
obtain that 


YS y(\h2(2) +hOG < cy. 
OQ<Y 
Hence we get (2.7) immediately. 

Sufficiency. Assume that (2.4), (2.7) hold true. Since $\varphi(|\Delta f(n)|) \ll \varphi(|c\Delta \log n|) + \varphi(|\Delta h_1(n)|) + \varphi(|\Delta h_2(n)|)$, summing over n up to x, the first two sums on the right hand side are bounded by x, so it remains to prove that

$$\sum_{n \le x} \varphi(|\Delta h_2(n)|) \ll x,$$

which will follow if we show that

$$\sum_{n \le x} \varphi(|h_2(n)|) \ll x. \tag{2.10}$$


Let J denote the set of those integers D for which p||D implies that h2(p) 4 0. The left 
hand side of (2.10) is bounded by 
h2(D 
x ae elh2(D)I)_ (2.11) 


Iterating (2.5) we obtain that 


g(lho(D)) < Yo c?G g(\no(q™))), 


q™||D 


where w(n) is the number of distinct prime divisors of n. Thus we have 


ype Ae = <1400 flea") eee) cOlgm) 


wes 

oS D q™|D q 
p(|h2(q”)|) ee) 
Hees 


qreT q D\eT 


On the right hand side both sums are convergent, and the proof is complete. C 
As a special case we have the following


Corollary 1 Let $f \in \mathcal{A}$. The inequality

$$\sum_{n \le x} |\Delta f(n)|^{\alpha} \ll x$$

holds with some constant $\alpha > 0$, if and only if there is a suitable constant c such that for h(n) := f(n) - c log n

$$\sum_{|h(p)| < 1} \frac{h^2(p)}{p} < \infty$$
and
$$\sum_{|h(q^m)| \ge 1} \frac{|h(q^m)|^{\alpha}}{q^m} < \infty$$
hold.


This assertion for $\alpha = 2$ was proved earlier by Elliott [17].


3 Characterization of $n^s$ as a Multiplicative Function


In a series of papers ([18] I-VI) I considered functions $f \in \mathcal{M}$ under the conditions that $\Delta f(n)$ tends to zero in some sense. I could determine all those functions $f, g \in \mathcal{M}^*$ for which the relation

$$\sum_{n=1}^{\infty} \frac{1}{n}\,|g(n + k) - f(n)| < \infty \tag{3.1}$$

with some fixed $k \in \mathbb{N}$ holds. Namely, I proved the following assertions.


Theorem 12 If $f, g \in \mathcal{M}$ and (3.1) holds with k = 1, then either

$$\sum \frac{|f(n)|}{n} < \infty, \qquad \sum \frac{|g(n)|}{n} < \infty, \tag{3.2}$$
or
$$f(n) = g(n) = n^{\sigma + i\tau}, \quad \sigma, \tau \in \mathbb{R},\ 0 \le \sigma < 1. \tag{3.3}$$


Theorem 13 Let $f, g \in \mathcal{M}^*$ and $k \ge 2$ be fixed. Assume that (3.1) holds, furthermore that f(n) = g(n) = 0 if (n, k) > 1 and $f(n) \ne 0$, $g(n) \ne 0$ if (n, k) = 1. Then either (3.2) is satisfied or there exist $F, G \in \mathcal{M}^*$ and $s \in \mathbb{C}$ with $\operatorname{Re} s < 1$, such that $f(n) = n^s F(n)$, $g(n) = n^s G(n)$, and

$$G(n + k) = F(n) \quad (n \in \mathbb{N}) \tag{3.4}$$

holds.
In [18, IV.] I determined all the solutions of (3.4) for completely multiplicative pairs of F, G, and in [19] even for $F, G \in \mathcal{M}$ under the additional condition that $F(n) \ne 0$ if (n, k) = 1. The above assertions are not obvious even in the case g = f.

An immediate consequence of Theorem 12 is that $\sum_{n=1}^{\infty} \frac{1}{n}|\lambda(n + 1) - \lambda(n)| = \infty$, where $\lambda$ is the Liouville function. This shows that the set of integers n for which $\lambda(n) \ne \lambda(n + 1)$ is not too small.
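
The divergence statement can be illustrated, though of course not proved, by computing partial sums. The sketch below is a hypothetical illustration: `liouville` computes $\lambda(n) = (-1)^{\Omega(n)}$ by trial division, `partial_sum` evaluates $\sum_{n \le N} \frac{1}{n}|\lambda(n+1) - \lambda(n)|$, and `sign_changes` counts the n up to N with $\lambda(n) \ne \lambda(n+1)$.

```python
def liouville(n):
    """Liouville function: (-1) raised to the number of prime factors of n, with multiplicity."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1


def partial_sum(limit):
    """Sum over 1 <= n <= limit of |lambda(n+1) - lambda(n)| / n."""
    return sum(abs(liouville(n + 1) - liouville(n)) / n
               for n in range(1, limit + 1))


def sign_changes(limit):
    """Number of n in [1, limit] with lambda(n) != lambda(n+1)."""
    return sum(1 for n in range(1, limit + 1)
               if liouville(n) != liouville(n + 1))
```

Printing `partial_sum(N)` for increasing N shows the partial sums growing slowly, which is consistent with divergence but gives no information about its rate.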

Recently, in a joint paper with B.M. Phong [20], we proved


Theorem 14 Let $k \in \mathbb{N}$ be fixed. Assume that $F, G \in \mathcal{M}$ and (3.4) is satisfied. Then either

$$S_F := \{n \mid F(n) \ne 0\} \quad \text{and} \quad S_G := \{n \mid G(n) \ne 0\}$$
are finite sets, or $F(n) \ne 0$ for every n coprime to k.


A special case was treated earlier in [21].
In [22] I formulated the following


Conjecture 1 If $f \in \mathcal{M}$ and

$$\frac{1}{x} \sum_{n \le x} |\Delta f(n)| \to 0, \tag{3.5}$$
then either
$$\frac{1}{x} \sum_{n \le x} |f(n)| \to 0 \quad \text{as } x \to \infty, \tag{3.6}$$

or $f(n) = n^s$, $\operatorname{Re} s < 1$.


Towards this conjecture, a few partial results are known.

First, assuming that (3.6) does not hold, from (3.5) one can deduce that $f \in \mathcal{M}^*$. This assertion was explicitly proved by Mauclaire and Murata [23] for functions f of modulus 1, but their method can be applied to the general case.

The second observation is that either $|f(n)| \ge 1$ for every n, or (3.6) holds. Indeed, let $|f(q)| = \varrho < 1$, $S(x) = \sum_{n \le x} |f(n)|$. Then

$$S(x) \le \sum_{m=1}^{[x/q]+1} \sum_{j=0}^{q-1} |f(mq + j)| \le \sum_{m=1}^{[x/q]+1} q\,|f(mq)| + \sum_{m} \sum_{j} |f(mq + j) - f(mq)|.$$

According to (3.5), the second sum on the right hand side is smaller than $\varepsilon x$ ($\varepsilon > 0$ arbitrary) if x is large enough, and the first sum is $q\varrho\,S([x/q] + 1)$; consequently,

$$\frac{S(x)}{x} \le \varrho\,\frac{S([x/q] + 1)}{x/q} + \varepsilon,$$

whence $S(x)/x \to 0$ immediately follows.
Moreover, arguing similarly, one can deduce that if (3.6) does not hold, then $|f(n)| = n^{\sigma}$ with a constant $\sigma$, $0 \le \sigma < 1$. Let $t(n) := f(n)n^{-\sigma}$, and assume that $\sigma > 0$. Since $t(n+1) - t(n) = f(n+1)\big((n+1)^{-\sigma} - n^{-\sigma}\big) + (\Delta f(n))n^{-\sigma}$, therefore

$$\sum_{n \le x} \frac{|t(n+1) - t(n)|}{n} \ll \sum_{n \le x} \frac{1}{n^2} + \sum_{n \le x} \frac{|\Delta f(n)|}{n^{1+\sigma}}.$$

The right hand side is clearly convergent, therefore Theorem 12 can be applied, whence we obtain that $t(n) = n^{i\tau}$, $\tau \in \mathbb{R}$, i.e. $f(n) = n^s$, $0 < \operatorname{Re} s < 1$.
The case when f(n) is of modulus 1 seems to be very hard. Hildebrand [23] proved


Theorem 15 There exists a positive constant c with the following property. If $g \in \mathcal{M}^*$, $|g(n)| = 1$ for $n \in \mathbb{N}$, and for every $p \in \mathcal{P}$, $|g(p) - 1| < c$, then either g(n) = 1 identically, or

$$\liminf_{x \to \infty} \frac{1}{x} \sum_{n \le x} |\Delta g(n)| > 0. \tag{3.7}$$

By using the ideas of Hildebrand and some of mine, I obtained [18, VI.]


Theorem 16 Let $g \in \mathcal{M}^*$, $|g(n)| = 1$ for $n \in \mathbb{N}$. There exist positive constants $\beta < 1$ and $\delta$ such that
$$\limsup_{x \to \infty} \sum_{x^{\beta} < p < x} \frac{|g(p) - 1|}{p} < \delta \tag{3.8}$$
and
$$\liminf_{x \to \infty} \frac{1}{x} \sum_{x/2 < n < x} |\Delta g(n)| = 0 \tag{3.9}$$

imply that g(n) = 1.
Let $\varrho : [1, \infty) \to [1, \infty)$ be a slowly varying function, i.e. such that

$$\lim_{x \to \infty} \max_{y \in [x/2, x]} \Big|\frac{\varrho(y)}{\varrho(x)} - 1\Big| = 0.$$


Let (2 denote the set of all arithmetical functions having complex values. f € Q is 
considered as an infinite dimensional vector, the n’th coordinate of which is f(n). Let 
a > | be aconstant and Q,., be the subspace of 22 which consists of those x € Q for which 


—— J} |xnl* 


n<y 


= rr yo(y)® 


is finite. 
Let Log = MN Qa.9; Lag =A Qa.o- 
In a joint paper with Indlekofer [24] we proved


Theorem 17 If $f \in \mathcal{M}$, $P \in \mathbb{C}[z]$, $P \ne 0$, $k = \deg P$, and
$$P(E)f \in \Omega_{\alpha,\varrho},$$
then either $f \in L_{\alpha,\varrho}$, or $f(n) = n^s u(n)$, where $0 \le \operatorname{Re} s < k$ and
$$P(E)u = 0.$$
The next assertion was proposed by myself as a conjecture and proved by Wirsing in 1984.


Theorem 18 If $f \in \mathcal{M}$, $\Delta f(n) \to 0$ as $n \to \infty$, then $f(n) \to 0$ as $n \to \infty$ or $f(n) = n^s$, $0 \le \operatorname{Re} s < 1$.


This theorem was proved some years later independently by Tang and Shao. The joint paper of Wirsing, Tang and Shao [25] contains two different proofs.

Wirsing's theorem can be formulated in the following way: If $F \in \mathcal{A}$ and $\|\Delta F(n)\| \to 0$, then with some suitable constant $\lambda \in \mathbb{R}$ we have that $F(n) - \lambda \log n$ is an integer for every $n \in \mathbb{N}$.

In other words, if T is the group of the reals mod 1, and $F \in \mathcal{A}_T$, $\Delta F(n) \to 0$, then F is a restriction of a continuous homomorphism from $\mathbb{R}_x$ to T.

B.M. Phong proved the following generalization of Wirsing's theorem.


Theorem 19 Let A, B be positive integers and let D be a real constant. If $h \in \mathcal{A}_T^*$ and
$$h(An + B) - h(n) - D \to 0 \quad \text{as } n \to \infty,$$
then h is the restriction of a continuous homomorphism $\mathbb{R}_x \to T$.


For A = 1 this assertion was generalised by Tang [29]:


Theorem 20 Let B be a fixed positive integer, f a multiplicative function defined on the set of the integers n coprime to B, such that $|f(n)| = 1$ and $f(n + B) - f(n) \to 0$, $n \to \infty$. Then there must be a $\tau \in \mathbb{R}$ such that $f(n) = n^{i\tau}\chi_B(n)$, where $\chi_B(n)$ is a Dirichlet character mod B.


By using this assertion one can completely characterize all those multiplicative functions f of modulus 1 for which $P(E)f(n) \to 0$ $(n \to \infty)$ holds. (For this see [18, I])
In a joint paper with N.L. Bassily [28] we proved


Theorem 21 If $f, g \in \mathcal{M}$ and $g(2n + 1) - Cf(n) \to 0$ with some nonzero constant C, then either $f(n) \to 0$ as $n \to \infty$, or C = f(2), $f(n) = n^s$, $0 \le \operatorname{Re} s < 1$, and g(n) = f(n) for every odd n.


The complete description of those $f, g \in \mathcal{M}$ for which $g(An + B) - Cf(an + b) \to 0$ $(n \to \infty)$ is not given yet.
4 On Additive Functions mod 1


T is considered here as the additive group $\mathbb{R}/\mathbb{Z}$. We say that $F \in \mathcal{A}_T$ is of finite support if $F(p^{\alpha}) = 0$ holds for every large prime p and every $\alpha \in \mathbb{N}$. For $F_\nu \in \mathcal{A}_T$ $(\nu = 0, 1, \ldots, k - 1)$ let

$$L_n(F_0, \ldots, F_{k-1}) := F_0(n) + \cdots + F_{k-1}(n + k - 1). \tag{4.1}$$

Conjecture 2 Let $\mathcal{L}_0$ be the space of those k-tuples $(F_0, \ldots, F_{k-1})$ of $F_\nu \in \mathcal{A}_T$ for which
$$L_n(F_0, \ldots, F_{k-1}) = 0 \quad (n \in \mathbb{N}) \tag{4.2}$$

holds. Then each $F_j$ is of finite support, and $\mathcal{L}_0$ is a finite dimensional $\mathbb{Z}$-module. Let $G_j(n) = \tau_j \log n$ (mod 1), $\tau_0 + \cdots + \tau_{k-1} = 0$. Then $L_n(G_0, G_1, \ldots, G_{k-1}) \to 0$ as $n \to \infty$.


Conjecture 3 If $F_\nu \in \mathcal{A}_T$ $(\nu = 0, \ldots, k - 1)$, and
$$L_n(F_0, \ldots, F_{k-1}) \to 0 \quad (n \to \infty),$$

then there exist suitable real numbers $\tau_0, \ldots, \tau_{k-1}$ such that $\tau_0 + \cdots + \tau_{k-1} = 0$, and for $H_j(n) := F_j(n) - \tau_j \log n$ we have

by CAG ein ea ae Oc) 


Remarks: 


1. Conjecture 3 for k = 1 can be deduced easily from Wirsing’s theorem. 

2. Conjecture 2 was proved for k = 3 under the stricter condition that $F_\nu \in \mathcal{A}_T^*$ in [30]. We obtained that (4.2) implies that $F_\nu = 0$ $(\nu = 0, 1, 2)$ identically.

3. Conjecture 2 for k = 3 was proved completely by R. Styer [31]. 

4. M. Wijsmuller treated similar problems for additive functions defined on the set of 
Gaussian integers taking values from T. See [32], [33]. 


Let P(n) be the largest and p(n) the smallest prime divisor of n. 


Conjecture 4 For every integer k(> 1) there exists a constant cy, such that for every prime 
p greater than cx, 
min max P(jp+l) <p (4.3) 
l<j  te[—k,k] 
Pii<p #0 


holds. 
We are unable to prove it even for k = 2. 


Proposition 1 Let Eo be the space of those |-tuples (Fo, ..., Fi-1) of Fy € A} for which 


LEn(Fo,..., Fi-1) = 0 (n € N). Assume that Conjecture 4 is true fork = 1. Then js is 
a finite dimensional space. 
Proof: Let $(F_0, \ldots, F_{l-1})$ be such an element of $\mathcal{L}_0$ for which $F_j(q) = 0$ for every $q < \max(c_l, l)$ and $j = 0, \ldots, l - 1$. We shall prove that $F_j(n) = 0$ for every $n \in \mathbb{N}$, $j = 0, \ldots, l - 1$. Assume the contrary, and let M be the smallest integer for which $F_t(M) \ne 0$ for some $t \in \{0, \ldots, l - 1\}$. Then M should be a prime. Since

$$L_{jM - t}(F_0, \ldots, F_{l-1}) = \sum_{i=0}^{l-1} F_i(jM - t + i) = 0,$$
from (4.3), by choosing that j for which (4.3) is attained (with M = p), we obtain that $F_t(M) = 0$.

Hence it follows that the initial values $F_j(q)$, $j = 0, \ldots, l - 1$; $q < \max(c_l, l)$ completely determine the functions $F_j$, if they are correlated according to (4.2).
The proof is complete. □


Let K be the closure of the set {L, (Fo, ..., Fx—1)|n € N}. 


Conjecture 5 If Fo,..., Fx—-1 € Aj} and K contains an element of infinite order, then 
| — 2 


This conjecture is obvious if k = 1, and it seems to be hard for $k \ge 2$. Recently, in our joint papers with M.V. Subbarao [34], [35] we obtained some partial results. This will be explained in the remaining part of this section.

Let $E_k = \{u/k \mid u = 0, 1, \ldots, k - 1\}$, i.e. the group of those elements $\alpha \in T$ for which $k\alpha = 0$. A special case of Conjecture 5 would be


Conjecture 6 Let $f \in \mathcal{A}_T^*$, and $H = \{\alpha_1, \ldots, \alpha_k\}$ be the set of the limit points of the sequence $f(n + 1) - f(n)$ $(n \in \mathbb{N})$. Then $H = E_k$, and there exists a real number $\tau$ such that $f(n) = \tau \log n + U(n)$ (mod 1), $U(\mathbb{N}) = E_k$, and for every $\omega \in E_k$ there exists a subsequence $n_\nu$ of integers such that $U(n_\nu + 1) - U(n_\nu) = \omega$.


We proved 


Theorem 22 


1) Conjecture 6 is true fork = 1, 2, 3. 

2) Letk = 4, and assume that the conditions of Conjecture 6 are satisfied. Then there 
isat € R such that f (n) = t log n + U(n) (mod 1) and either (A) or (B) hold: 
(A) 7H = E4, U(N) © Eg 
(B) H consists of four distinct elements of Es, i.e. H = fac! K 2, 3, Ks}, where K 

is any nonzero element of E5, moreover U(N) C Es and U(n+1)—U(n) € H 
for every large n. 


Remark: We think that case (B) cannot hold, which would follow if we could prove that $U(\mathbb{N}) = E_5$ implies that for every $\alpha \in E_5$, $U(n + 1) - U(n) = \alpha$ occurs infinitely often.
5 Characterization of Continuous Homomorphisms
as Elements of $\mathcal{A}_G^*$ for Compact Groups


We investigated this topic in a series of papers written jointly with Z. Daróczy [36-41].

Assume in this section that G is a metrically compact Abelian group supplied with some translation invariant metric $\varrho$. An infinite sequence $\{x_n\}_{n=1}^{\infty}$ in G is said to belong to $E_D$ if for every convergent subsequence $x_{n_1}, x_{n_2}, \ldots$ the "shifted subsequence" $x_{n_1 + 1}, x_{n_2 + 1}, \ldots$ is convergent, too. Let $E_\Delta$ be the set of those sequences $\{x_n\}_{n=1}^{\infty}$ for which $\Delta x_n = x_{n+1} - x_n \to 0$ $(n \to \infty)$ holds. Then $E_\Delta \subseteq E_D$. We say that $f \in \mathcal{A}_G^*$ belongs to $\mathcal{A}_G^*(\Delta)$ (resp. $\mathcal{A}_G^*(D)$) if the sequence $\{f(n)\}_{n=1}^{\infty}$ belongs to $E_\Delta$ (resp. $E_D$).

We proved the following assertions.


(1) $\mathcal{A}_G^*(\Delta) = \mathcal{A}_G^*(D)$.
(2) If $f \in \mathcal{A}_G^*(D)$, then there exists a continuous homomorphism $\Phi : \mathbb{R}_x \to G$ such that $f(n) = \Phi(n)$ $(n \in \mathbb{N})$.
The proof of (2) was based upon the theorem of Wirsing (Theorem 18).
The set of all limit points of $\{f(n)\}_{n=1}^{\infty}$ forms a compact subgroup in G which is denoted by $S_f$.
(3) $f \in \mathcal{A}_G^*(D)$ if and only if there exists a continuous function $H : S_f \to S_f$ such that $f(n + 1) - H(f(n)) \to 0$ as $n \to \infty$.
(4) In [41] we characterized those $f \in \mathcal{A}_G^*$ for which with some continuous function $F : S_f \to S_f$ the relation $f(2n - 1) - F(f(n)) \to 0$ $(n \to \infty)$ holds. For G = T we obtained that either f(n) = 0 for every odd n, or there exists a nonzero $\lambda \in \mathbb{R}$ such that $f(n) = \lambda \log n$ (mod 1) for every $n \in \mathbb{N}$.
(5) In [44] we solved the following problem. Let $G_1$, $G_2$ be metrically compact Abelian groups with some translation invariant metrics. Let $f \in \mathcal{A}_{G_1}^*$, $g \in \mathcal{A}_{G_2}^*$ and assume that with some continuous function $F : S_f \to S_g$ the relation $g(n - 1) - F(f(n)) \to 0$ $(n \to \infty)$ holds. E.g. for $G_1 = T$ we proved: Under the above conditions, either g(n) = 0 identically, or there exist $\tau \in \mathbb{R}$, $M \in \mathbb{N}$, $u \in \mathcal{A}_{E_M}^*$, such that $f(n) = \tau \log n + u(n)$ mod 1. Let $A(n) := Mf(n)$ $(n \in \mathbb{N})$. Then the correspondence $A(n) \leftrightarrow g(n)$ $(n \in \mathbb{N})$ generates a topological isomorphism between $S_A$ and $S_g$. The converse assertion is also true.
(6) Further interesting results were obtained by Phong [42], [43].
(7) The main problem we are interested in is the following one:
Let $f_j \in \mathcal{A}_{G_j}^*$ $(j = 0, \ldots, k - 1)$, $G_j$ be compact groups, $e_n := \{f_0(n), f_1(n + 1), \ldots, f_{k-1}(n + k - 1)\}$. Then $e_n \in S_{f_0} \times \cdots \times S_{f_{k-1}}$ $(=: U)$. What can we say about the functions $f_j$ if the set of limit points is not everywhere dense in U? We shall formulate our guesses only for special cases.

Conjecture 7 Let $f \in A_T^*$, $S_f = T$, $e_n = (f(n), \ldots, f(n+k-1))$. Then, either
$f(n) = \lambda \log n \pmod 1$ with some $\lambda \in \mathbb{R}$, or $\{e_n \mid n \in \mathbb{N}\}$ is dense in $T^k = T \times \cdots \times T$.


Conjecture 8 Let $f, g \in A_T^*$, $S_f = S_g = T$, $e_n := (f(n), g(n+1))$. If $e_n$ is not
everywhere dense in $T^2 = T \times T$, then $f$ and $g$ are rationally dependent continuous
homomorphisms, i.e. there exist $\lambda \in \mathbb{R}$, $s \in \mathbb{Q}$ such that $g(n) = s f(n) \pmod 1$, $f(n) =
\lambda \log n \pmod 1$.




Mauclaire proved in [45] that if $G$ is an arbitrary locally compact group and $f \in A_G$
satisfies $\Delta f(n) \to 0$ $(n \to \infty)$ then $f$ is the restriction of a continuous homomorphism
$\varphi : \mathbb{R}_x \to G$. Ruzsa and Tijdeman proved [46] that this cannot be generalized to all groups.


6 Sets of Uniqueness for Completely Additive Functions 


Definition: We say that $E \subseteq \mathbb{N}$ is a set of uniqueness for the functions belonging to $A^*$ if
$f \in A^*$, $f(E) = 0$ implies that $f(\mathbb{N}) = 0$.


I introduced this notion in [47], and in [48] it was proved that if to the sequence of
“prime plus one”s we adjoin a finite set of integers then we obtain a set of uniqueness. My
guess that the set of shifted primes itself is a set of uniqueness was proved by Elliott [49].

It was proved by Wolke [50], and Dress and Volkmann [51], that in order for a set $E$ to
be such a set of uniqueness, it is necessary and sufficient that every positive integer $n$ has a
multiplicative representation:

\[ n^h = \prod_{i=1}^{k} a_i^{\varepsilon_i}, \qquad a_i \in E, \quad \varepsilon_i = \pm 1. \]

The $h$, $k$ may vary with $n$. They used vector spaces over the field of rational numbers.
In [52] Elliott proved my further conjecture, namely that if $f \in A^*$, $M(x) = \max_{n \le x} |f(n)|$,
$E(x) = \max_{p \le x} |f(p+1)|$, then


\[ M(x) \le A E(x^B), \qquad x \ge 2 \tag{6.1} \]


holds with suitable numerical constants $A$, $B$. For the wider class $f \in A$ he got a weaker
result, namely that
\[ M(x) \le A E(x^B) + A M((\log x)^C) \]
for some $C > 0$.
Wirsing extended (6.1) for $f \in A$ [53]. He proved that every $n \in \mathbb{N}$ has a representation


\[ n^h = \prod_{i=1}^{k} (p_i + 1)^{\varepsilon_i}, \]

where $h$ and $k$ are bounded, $\varepsilon_i = \pm 1$, and the primes $p_i$ lie in an interval $n < p_i < n^c$.

In particular, Wirsing’s result showed that for the multiplicative group $K$ generated by the
“prime plus one”s, $Q_x/K$ has bounded order.

Another interesting consequence of Wirsing’s result is that $f \in A$, $f(p+1) \to 0$
$(p \in P)$ implies that $f(n) = 0$.

My motivation with the investigation of the set of shifted primes was the following. 
In 1968 I proved [54] that $f \in A$ has a limit distribution on the set of shifted primes if the
three series
\[ \sum_{|f(p)| < 1} \frac{f(p)}{p}, \qquad \sum_{|f(p)| < 1} \frac{f^2(p)}{p}, \qquad \sum_{|f(p)| \ge 1} \frac{1}{p} \tag{6.2} \]
are convergent. But the question of the necessity of these conditions remained open.
The necessity of the convergence of the series was first proved under additional assumptions:
a) if $f(p) \ge 0$, by Elliott [55]; b) if $f(p) = O(1)$, by Kátai [56]. Finally it was proved
without any other conditions by Hildebrand [57] in 1988. From his result it follows that, if
$f \in A$ satisfies
\[ \frac{\#\{p \le x : |f(p+1)| \ge \varepsilon\}}{\pi(x)} \to 0 \qquad (x \to \infty) \]
for every $\varepsilon > 0$, then $f(n) = 0$ identically.
The notion of sets of uniqueness can be extended to group valued arithmetical functions.


Definition 2. Let $G$ be an arbitrary Abelian group. We say that $E \subseteq \mathbb{N}$ is a set of uniqueness
for the class of functions in $A_G^*$ if $f \in A_G^*$, $f(E) = 0$ implies that $f(\mathbb{N}) = 0$.


For G = T the following assertion has been proved by Meyer [58], Indlekofer [59], 
Dress and Volkman [51], see also Elliott [60]: 

In order that $E$ be a set of uniqueness for the class $A_T^*$, it is necessary and sufficient
that every positive integer $n$ has a representation

\[ n = \prod_{j=1}^{k} a_j^{d_j} \qquad (a_j \in E) \]

with some integers $d_j$, positive, negative or zero.
Probably, the set of “prime plus one”s is a set of uniqueness for $A_T^*$, but it does not seem
to be easy. Presently it is not disproved even that FQ (p+1) = 0(mod 1) for every large p. 
In my paper [15] implicitly it was proved that there is a constant L such that every integer 
n has a representation 


\[ n = A \prod_{i=1}^{k} (p_i + 1)^{\varepsilon_i}, \qquad \varepsilon_i = \pm 1, \]

where $A$ is a rational number such that all prime factors in its reduced form are less
than $L$. The constant $L$ was implicit, since I used the Bombieri-Vinogradov theorem. Later
Elliott [61] proved that $L = 10^{387}$ is appropriate.

This bound is extremely large for computation. If we could reduce it to $10^{17}$, say, then
with a massive computer calculation perhaps we could prove that $K = Q_x$.

Recently Elliott [62] proved that the factor group $Q_x/K$ is either trivial or is of order 2
or 3.

Schinzel and Sierpiński in 1958 stated the conjecture [62] that every positive rational
has infinitely many representations of the form $(p+1)(q+1)^{-1}$ with $p, q \in P$. From this
$K = Q_x$ would immediately follow.
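For integer targets the conjectured representations are easy to look for by machine. The following is a minimal sketch, not taken from the cited papers: the function names `primes_up_to` and `shifted_prime_quotients` and the search bound are illustrative assumptions. It lists prime pairs $(p, q)$ with $p + 1 = n(q + 1)$ below a chosen bound, which of course proves nothing about infinitely many representations.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: return all primes <= limit (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [n for n, is_prime in enumerate(sieve) if is_prime]


def shifted_prime_quotients(n, limit):
    """Return all pairs of primes (p, q), p <= limit, with p + 1 == n * (q + 1).

    Each pair is a representation n = (p + 1) / (q + 1) of the kind
    appearing in the Schinzel-Sierpinski conjecture (hypothetical helper).
    """
    primes = primes_up_to(limit)
    prime_set = set(primes)
    pairs = []
    for q in primes:
        p = n * (q + 1) - 1
        if p > limit:
            break  # p grows with q, so no further candidates fit the bound
        if p in prime_set:
            pairs.append((p, q))
    return pairs


if __name__ == "__main__":
    for n in range(2, 11):
        print(n, shifted_prime_quotients(n, 200)[:3])
```

A search of this kind only illustrates the statement for small $n$; the group-theoretic consequence $K = Q_x$ needs the representations for every positive rational.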

By using the method of Chen [63] one can prove that every natural number $n$ has infinitely
many representations of the form $(P_2 + 1)(Q_2 + 1)^{-1}$, where $P_2$, $Q_2$ run over the integers
the number of prime factors of which is at most 2. Consequently the multiplicative group
$K_2$ generated by the set $P_2 + 1$, where $P_2$ runs over the integers having at most two prime
factors, equals $Q_x$.


References 


1. P. Erdős, On the distribution function of additive functions, Ann. Math. 47 (1946), 1-20.

2. I. Katai, A remark on additive arithmetical functions, Annales Univ. Sci. Budapest, Sectio Math. 
10 (1967), 81-83. 

3. E. Wirsing, A characterization of log n as an additive arithmetic function, Symposia Mathematica,
Istituto Nazionale di Alta Matematica IV (1970).

4. I. Kátai, On a problem of P. Erdős, J. Number Theory 2 (1970), 1-6.

5. E. Wirsing, Characterization of the logarithm as an additive function, Proceedings of the 1969 
Summer Institute of number theory, prepared by the Amer. Math. Soc. (1971), 375-381. 

6. E. Wirsing, Additive and completely additive functions with restricted growth, Recent Progress 
in Analytic Number Theory, vol. 2, London, 1981, pp. 231-280. 

7. I. Katai, On additive functions, Publ. Math. Debrecen 25 (1978), 251-257. 

8. I. Kátai, Characterization of log n, Studies in Pure Mathematics (to the memory of Paul Turán),
Akadémiai Kiadó, Budapest, 1984, pp. 415-421.

9. I. Katai, Some results and problems in the theory of additive functions, Acta Sci. Math. Szeged 
30 (1969), 305-311. 

10. I. Kátai, On number-theoretical functions, Colloquia Mathematica Societatis János Bolyai, vol. 2,
North-Holland, Amsterdam, 1970, pp. 133-136.

11. J.L. Mauclaire, Sur la régularité des fonctions additives, Séminaire Delange-Pisot-Poitou,
Théorie des Nombres, Paris 15 (1973/74), no. 23.

12. P.D.T.A. Elliott, On additive arithmetic function f(n) for which f(an + b) - f(cn + d) is
bounded, J. Number Theory 16 (1983), 285-310.

13. P.D.T.A. Elliott, Arithmetic functions and integer products, Springer V., New York, 1985. 

14. A. Hildebrand, An Erdős-Wintner theorem for differences of additive functions, Trans. Amer.
Math. Soc. 310 (1988), 257-276.

15. P.D.T.A. Elliott, The value distribution of differences of additive arithmetic functions, J. Number 
Theory 32 (1989), 339-370. 

16. I. Katai and Indlekofer, K.-H., Estimation of generalized moments of additive functions over the 
set of shifted primes, Acta Sci. Math. 56 (1992), 229-236. 

17. P.D.T.A. Elliott, Sums and differences of additive arithmetic functions in mean square, J. reine 
und angewandte Mathematik 309 (1979), 21-54. 

18. I. Katai, Multiplicative functions with regularity properties, I-VI, Acta Math. Hung. 42 (1983), 
295-308; 43 (1984), 105-130, 259-272; 44 (1984), 125-132; 45 (1985), 379-380; 58 (1991), 
343-350. 

19. I. Katai, Arithmetical functions satisfying some relations. Acta Sci. Math. 55, 249-268. 

20. I. Kátai and Phong, B.M., On some pairs of multiplicative functions correlated by an equation,
New Trends in Probability and Statistics, Analytic and Probabilistic Methods in Number Theory, 
TEV, Vilnius, Lithuania 4, 191-203. 

21. J. Fehér, Katai, I. and Phong, B.M., On multiplicative functions satisfying a special relation, Acta 
Sci. Math. (accepted). 

22. I. Katai, Some problems in number theory, Studia Scient. Math. Hung. 16 (1981), 289-295. 

23. A. Hildebrand, Multiplicative functions at consecutive integers II., Math. proc. Cambridge Phil. 
Soc. 103 (1988), 389-398. 

24. K.-H. Indlekofer and Katai, I., Multiplicative functions with small increments III., Acta Math. 
Hung. 58 (1991), 121-132. 

25. E. Wirsing, Tang Yuansheng and Shao Pintsung, On a conjecture of Katai for additive functions, 
J. Number Theory 56 (1996), 391-395. 

26. I. Katai, Additive functions with regularity properties, Acta Sci. Math. 44 (1982), 299-305. 




27. B.M. Phong, A characterization of some arithmetical multiplicative functions, Acta Math. Hung. 
63 (1994), 29-43. 

28. N.L. Bassily and Katai, I., On the pairs of multiplicative functions satisfying some relations, 
Aequationes Math. 55 (1998), 1-14. 

29. Tang Yuansheng, A reverse problem on arithmetic functions, J. Number Theory 58 (1996), 
130-138. 

30. I. Katai, On additive functions satisfying a congruence, Acta Sci. Math. 47 (1984), 85-92. 

31. R. Styer, A problem of Kátai on sums of additive functions, Acta Sci. Math. 55 (1991), 269-286.

32. M. Wijsmuller, Additive functions on the Gaussian integers, Publ. Math. Debrecen 38 (1991), 
255-262. 

33. I. Kátai and Wijsmuller, M., Additive functions satisfying congruences, Acta Sci. Math. 56 (1992),
63-72.

34. I. Kátai and Subbarao, M.V., The characterization of $n^{i\tau}$ as a multiplicative function, Acta Math.
Hung. (accepted).

35. I. Kátai and Subbarao, M.V., On the multiplicative function $n^{i\tau}$, Studia Sci. Math. (accepted).

36. Z. Daróczy and Kátai, I., On additive arithmetical functions with values in the circle group, Publ.
Math. Debrecen 34 (1987), 307-312.

37. Z. Daróczy and Kátai, I., On additive arithmetical functions with values in topological groups,
Publ. Math. Debrecen 33 (1986), 287-292.

38. Z. Daróczy and Kátai, I., On additive number-theoretical functions with values in a compact
Abelian group, Aequationes Math. 28 (1985), 288-292.

39. Z. Daróczy and Kátai, I., On additive arithmetical functions with values in topological groups II,
Publ. Math. Debrecen 34 (1987), 65-68.

40. Z. Daróczy and Kátai, I., On additive functions taking values from a compact group, Acta Sci.
Math. 53 (1989), 59-65.

41. Z. Daróczy and Kátai, I., Characterization of additive functions with values in the circle group,
Publ. Math. Debrecen 36 (1989), 1-7.

42. B.M. Phong, Note on multiplicative functions with regularity properties, Publ. Math. Debrecen 
41 (1992), 117-125. 

43. B.M. Phong, Characterization of additive functions with values in a compact Abelian group, 
Publ. Math. Debrecen 40 (1992), 273-278. 

44. Z. Daróczy and Kátai, I., Characterization of pairs of additive functions with some regularity
property, Publ. Math. Debrecen 37 (1990), 217-221.

45. J.L. Mauclaire, On the regularity of group valued additive arithmetical functions, Publ. Math. 
Debrecen 44 (1994), 285-290. 

46. I.Z. Ruzsa and Tijdeman, R., On the difference of integer-valued additive functions, Publ. Math. 
Debrecen 39 (1991), 353-358. 

47. I. Katai, On sets characterizing number-theoretical functions, Acta Arithm. 13 (1968), 
315-320. 

48. I. Katai, On sets characterizing number-theoretical functions II (The set of ‘‘prime plus one’s is 
a set of quasi uniqueness), Acta Arithm. 16 (1968), 1-4. 

49. P.D.T.A. Elliott, A conjecture of Katai, Acta Arithm. 26 (1974), 11-20. 

50. D. Wolke, Bemerkungen über Eindeutigkeitsmengen additiver Funktionen, Elem. der Math. 33
(1978), 14-16.

51. F. Dress and Volkmann, B., Ensembles d’unicité pour les fonctions arithmétiques additives ou
multiplicatives, C. R. Acad. Sci. Paris Ser. A 287 (1978), 43-46.

52. P.D.T.A. Elliott, On two conjectures of Katai, Acta Arithm. 30 (1976), 35-39. 

53. E. Wirsing, Additive functions with restricted growth on the numbers of the form p + 1, Acta 
Arithm. 37 (1980), 345-357. 


54. I. Kátai, On distribution of arithmetical functions on the set of prime plus one, Compositio Math.
19 (1968), 278-289.

55. P.D.T.A. Elliott, On the limiting distribution of f(p + 1) for non-negative additive functions,
Acta Arithm. 25 (1974), 259-264.

56. I. Kátai, Some remarks on additive arithmetical functions, Litovsk. Mat. Sb. 9 (1969), 515-518.

57. A. Hildebrand, Additive and multiplicative functions on shifted primes, Proc. London Math. Soc.
53 (1989), 209-232.

58. J. Meyer, Ensembles d’unicité pour les fonctions additives. Etude analogue dans les cas des
fonctions multiplicatives, Journées de Théorie Analytique et Elémentaire des Nombres, Orsay, 2
et 3 Juin, 1980. Publications Mathématiques d’Orsay, 50-66.

59. K.-H. Indlekofer, On sets characterizing additive and multiplicative arithmetical functions,
Illinois J. of Math. 25 (1981), 251-257.

60. P.D.T.A. Elliott, On representing integers as products of integers of a prescribed type, J. Australian
Math. Soc. (Series A) 35 (1983), 143-161.

61. P.D.T.A. Elliott, On representing integers as products of the p + 1, Monatshefte für Math. 97
(1984), 85-97.

62. P.D.T.A. Elliott, The multiplicative group of rationals generated by the shifted primes,
I (manuscript).

63. J. Chen, On the representation of a larger even integer as the sum of a prime and the product of
at most two primes, Sci. Sinica 16 (1973), 157-176.


Eötvös University

Math Institute 

Budapest, Muzeum Krt. 6-8 

1088, Hungary 

E-Mail: katai@compalg.elte.hu 
imre.katai@elte.hu


Hamburger’s Theorem on $\zeta(s)$ and the Abundance
Principle for Dirichlet Series with Functional Equations


Marvin I. Knopp 


I Introduction 


Ask any mathematician - indeed any number theorist - to state Hamburger’s theorem;
chances are the response will be something like, “Riemann’s function $\zeta(s)$ is uniquely
determined by its functional equation.” In fact, this is correct, as far as it goes, but (as is
often the case) closer examination shows that it does not go nearly far enough.

Hecke grasped the subtleties inherent in Hamburger’s theorem (1921) at least by 1944.
In his final published paper [8], appearing that year, Hecke describes two versions of the
theorem. I quote from the introduction to [8] (translation mine):


“The analytic function $\varphi(s)$ of the complex variable $s$ is determined up to a constant by
the following conditions: Put $R(s) = \pi^{-s}\Gamma(s)\varphi(s)$.”


1. With a suitable polynomial $P(s)$ suppose that $P(s)\varphi(s)$ is an entire function of
finite genus.
2. Suppose $\varphi(s)$ satisfies the equation

\[ R(s) = R\left(\tfrac{1}{2} - s\right). \]

3(a). Suppose that not only $\varphi(s)$, but also $\varphi(s/2)$, can be expanded in a Dirichlet
series convergent somewhere: $\varphi(s) = \sum_{n=1}^{\infty} b(n) n^{-2s}$. This condition can also
be replaced by

3(b). Suppose that the only pole allowed for $\varphi(s)$ is $s = 1/2$; but we assume only the
expressibility of $\varphi(s)$ itself as a Dirichlet series $\varphi(s) = \sum_{n=1}^{\infty} b(n) n^{-s}$, not that
of $\varphi(s/2)$.


“Mr. Hamburger first proved that $\varphi(s)$ is uniquely determined by 1, 2, 3(a) and thus =
const. $\zeta(2s)$” [5]; “that also 1, 2, 3(b) suffice I have proved within the framework
of a general investigation, by means of reduction to the theory of certain automorphic
functions [6].”


While Hamburger discovered and gave the first proof of the well-known theorem bearing 
his name, for our purposes Siegel’s elegant proof, published one year after Hamburger’s, has 
greater relevance. The two formulations described by Hecke are in some ways quite distinct, 
but Siegel’s (and, indeed, Hamburger’s) proof of the Hamburger version and Hecke’s proof 
of his own version are closely linked by their common use of the Mellin transform or, more
accurately, its inverse. The idea in both cases is to show that the inverse Mellin transform of
the function $R(s)$ described by Hecke is a constant times $\vartheta(z) - 1$, where $\vartheta$ is the classical
Jacobi function, given by

\[ \vartheta(z) = \sum_{n=-\infty}^{\infty} e^{\pi i n^2 z} = 1 + 2\sum_{n=1}^{\infty} e^{\pi i n^2 z}, \tag{1.1} \]

for $z$ in $\mathcal{H} = \{z \in \mathbb{C} \mid \operatorname{Im} z > 0\}$. From this fact it follows immediately that $\varphi(s) =$
const. $\zeta(2s)$.

As we shall observe in §II.2 (see especially (2.6)), the Hamburger condition 3(a), above,
implies immediately that the inverse Mellin transform of $R(s)$ has the form

\[ \sum_{n=1}^{\infty} b(n) e^{\pi i n^2 z}, \tag{1.2} \]

and thus has the general shape of $\vartheta(z) - 1$, even before the application of conditions 1
and 2. Siegel then shows that the two latter conditions lead to a “modular relation” for
the series (1.2), not that of a modular form, but the more general relation of a “modular
integral.” (See (2.12), below, and §IV.2 for the definition.) As it turns out, in the presence
of (1.2) this more general relation suffices to imply that the function defined by (1.2) equals
const.$\{\vartheta(z) - 1\}$, and thus to conclude the proof.

The Hecke condition 3(b), on the other hand, gives nothing like (1.2) (only that $F(z)$, the
inverse Mellin transform of $R(s)$, has the form $\sum_{n=1}^{\infty} b(n) e^{\pi i n z}$), but the severe restriction
on the singularities of $\varphi(s)$ in 3(b), together with condition 2, implies instead that with $a_0$
suitably chosen, $F(z) + a_0$ is a modular form of weight $\frac{1}{2}$ possessing precisely the same
transformation properties as does $\vartheta(z)$. Hecke then invokes a familiar uniqueness result on
modular forms to conclude that $F(z) + a_0 = a_0\vartheta(z)$, and thus that $\varphi(s) = 2a_0\zeta(2s)$.

With these contrasting versions of Hamburger’s theorem in mind, it appears natural to
relax both the expressibility of $\varphi(s/2)$ as a Dirichlet series in 3(a) and the restriction on the
poles of $\varphi(s)$ in 3(b), to conjecture that $\varphi(s)$ is uniquely determined by 1, 2 and


3. Suppose (only) that $\varphi(s)$ can be expanded in a Dirichlet series convergent somewhere.


While appealing, this conjectured “strong Hamburger’s theorem” fails spectacularly. Indeed,
[12, Theorem 1] presents the 

Abundance Principle for Dirichlet Series with Functional Equation. There exist 
infinitely many linearly independent Dirichlet series satisfying the conditions 1, 2 and 3. 

There are generalizations of this Principle. For detailed statements see §V.1, below, [12, 
§§I & V], and [13, Theorem 1]. 

The proofs of the Principle and its generalizations fall into two steps. The first is an appli- 
cation of the Riemann-Hecke correspondence, as extended by Bochner [1], to translate the 
question of existence of the desired Dirichlet series into a question of existence of modular 
integrals with equivalent properties (§IV.4). The second step is the construction, by means
of Eichler’s generalized Poincaré series [4, 11], of infinitely many linearly independent 
modular integrals of the appropriate kind. (See §V.2 for further details.) 
II Siegel’s Proof


1. Preliminary observations. We begin by outlining Siegel’s celebrated proof of 
Hamburger’s theorem [17], above all because of its relevance to our point of view. Indeed, 
his proof foreshadows our approach to the Abundance Principle, featuring an application 
of the Riemann-Hecke-Bochner correspondence, fourteen years before Hecke developed 
it as a systematic link between modular forms, on the one hand, and Dirichlet series with 
functional equations, on the other [6, 7], and twenty-nine years before Bochner’s general- 
ization [1, 3] to modular integrals. (We stop short of claiming that Siegel’s work antedates 
Riemann’s invention of the correspondence in the latter’s derivation of the functional equation
of $\zeta(s)$ from the transformation formula of $\vartheta(z)$ under $z \to -1/z$. See (4.1), below.)

Furthermore, aside from Hurwitz’s construction of the Eisenstein series E of weight 2 
on the full modular group [9], Siegel’s proof contains the first (to my knowledge) published 
example of a modular integral with log-polynomial period function. (See (2.12) and §IV.3, 
below, for the definition.) It is certainly the first occurrence of a modular integral within 
the context of the Riemann-Hecke-Bochner correspondence. (Of course, Siegel proceeds
to show that the modular integral is a multiple of $\vartheta(z) - 1$ and thus, in fact, a modular
form; however, this small irony does not diminish the point.) A model of mathematical 
insight and elegance, this proof is relevant to research today, notwithstanding the passage 
of three-quarters of a century. 

The statement that Siegel, like Hamburger before him, proves differs from Hecke’s description
of it in two respects. It posits the existence of two Dirichlet series

\[ f(s) = \sum_{n=1}^{\infty} a_n n^{-s}, \qquad g(s) = \sum_{n=1}^{\infty} b_n n^{-s} \tag{2.1} \]

and a polynomial $P(s)$, such that

(i) $P(s) f(s)$ is an entire function of finite genus;
(ii) $f(s)$ converges absolutely for $\sigma = \operatorname{Re} s > 2 - \delta$ (some $\delta > 0$);
(iii) $g(s)$ converges absolutely for $\sigma > 1 + \alpha$ (some $\alpha > 0$);
(iv) $\pi^{-s/2}\,\Gamma(s/2)\, f(s) = \pi^{-(1-s)/2}\,\Gamma\left(\frac{1-s}{2}\right) g(1-s)$. (2.2)

The conclusion: $f(s) = g(s) =$ const. $\zeta(s)$.

The two ways in which Hecke rephrased the hypotheses (2.2) can be read in condition
(2.2, iv). In Hecke the two Dirichlet series $f(s)$, $g(s)$ have been replaced by the single $\varphi(s)$.
However, this apparent loss of generality is not significant since (2.2, iv) implies immediately
that $R_1(s) = \pi^{-s/2}\Gamma(s/2)\{f(s) + g(s)\}$ and $R_2(s) = \pi^{-s/2}\Gamma(s/2)\{f(s) - g(s)\}$ satisfy,
respectively, $R_1(s) = R_1(1-s)$ and $R_2(s) = -R_2(1-s)$, functional equations with the
same Dirichlet series on both sides.

Hecke’s second change in the functional equation amounts to a replacement of $s$ by $2s$
in (2.2, iv), which then becomes

\[ \pi^{-s}\,\Gamma(s)\,\tilde f(s) = \pi^{-(1/2 - s)}\,\Gamma\left(\tfrac{1}{2} - s\right)\tilde g\left(\tfrac{1}{2} - s\right), \tag{2.3} \]

where $\tilde f(s) = f(2s)$ and $\tilde g(s) = g(2s)$. With $\tilde f = \tilde g = \varphi$ and $R(s) = \pi^{-s}\Gamma(s)\varphi(s)$, (2.3)
reduces to the Hecke formulation $R(s) = R\left(\frac{1}{2} - s\right)$. That $\varphi(s) = \tilde f(s) = f(2s)$ accounts
for the condition 3(a) in Hecke’s description of Hamburger’s version, the condition which,
in his own formulation, he replaces by 3(b), the restriction on the poles of $\varphi(s)$.


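2. Outline of the Proof. Siegel’s proof is considerably more direct than that of
Hamburger, using only the familiar formulae

\[ e^{-y} = \frac{1}{2\pi i}\int_{1-i\infty}^{1+i\infty} y^{-s}\,\Gamma(s)\,ds, \qquad y > 0 \tag{2.4} \]

(the inverse Mellin transform of $\Gamma(s)$ is $e^{-y}$) and

\[ \int_0^{\infty} e^{-a^2 x - b^2/x}\,\frac{dx}{\sqrt{x}} = \frac{\sqrt{\pi}}{a}\,e^{-2ab} \tag{2.5} \]

for $a > 0$, $b > 0$ (evaluation of the Bessel integral). The first step is the simple observation
that (2.2, iv) implies $S_1 = S_2$ for $y > 0$, where

\[ S_1 = \frac{1}{2\pi i}\int_{2-i\infty}^{2+i\infty} f(s)\,\Gamma\left(\frac{s}{2}\right)\pi^{-s/2}\,y^{-s/2}\,ds, \]
\[ S_2 = \frac{1}{2\pi i}\int_{2-i\infty}^{2+i\infty} g(1-s)\,\Gamma\left(\frac{1-s}{2}\right)\pi^{-(1-s)/2}\,y^{-s/2}\,ds. \]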
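The Bessel-type integral (2.5), $\int_0^\infty e^{-a^2x - b^2/x}\,x^{-1/2}\,dx = (\sqrt{\pi}/a)\,e^{-2ab}$, can be sanity-checked numerically. The sketch below is only an illustration: the function names, the cut-off and the step count are assumptions of this example. It substitutes $x = u^2$, so that $dx/\sqrt{x} = 2\,du$, and applies the trapezoidal rule.

```python
import math


def bessel_type_integral(a, b, steps=20_000):
    """Approximate the left-hand side of (2.5) for a > 0, b > 0.

    After the substitution x = u**2 the integral becomes
    2 * integral_0^inf exp(-(a*u)**2 - (b/u)**2) du, which is evaluated
    with the trapezoidal rule on [0, upper].
    """
    upper = 12.0 / a          # beyond this point exp(-(a*u)**2) < e^{-144}
    h = upper / steps
    total = 0.0
    for j in range(1, steps):
        u = j * h
        total += math.exp(-(a * u) ** 2 - (b / u) ** 2)
    # The endpoint u = 0 contributes 0 (since b > 0); add half the last node.
    total += 0.5 * math.exp(-(a * upper) ** 2 - (b / upper) ** 2)
    return 2.0 * h * total


def bessel_closed_form(a, b):
    """Right-hand side of (2.5)."""
    return math.sqrt(math.pi) / a * math.exp(-2.0 * a * b)


if __name__ == "__main__":
    for a, b in [(1.0, 1.0), (1.3, 0.7), (0.5, 2.0)]:
        print(a, b, bessel_type_integral(a, b), bessel_closed_form(a, b))
```

In the proof the formula is applied with $a = \sqrt{\pi}\,t$ and $b = \sqrt{\pi}\,n$, which is where the factor $e^{-2\pi n t}$ in (2.10) comes from.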


From (2.4) and an easily-justified interchange of sum and integral it follows that

\[ S_1 = 2\sum_{n=1}^{\infty} a_n e^{-\pi n^2 y}. \tag{2.6} \]

Thus $S_1$ is the exponential series

\[ 2\sum_{n=1}^{\infty} a_n e^{\pi i n^2 z}, \qquad \operatorname{Im} z > 0, \]

evaluated on the positive imaginary axis $z = iy$, $y > 0$. It is this series that must be proved
equal to $a(\vartheta(z) - 1)$, with $a \in \mathbb{C}$. (This equation is equivalent to: $a_n = a$ for all $n \ge 1$.
The same is true of the equation $f(s) = a\zeta(s)$.)

The conditions (2.2, i, ii), the functional equation (2.2, iv), Stirling’s formula and the
Phragmén-Lindelöf principle combine to give an estimate on the growth of $g(1-s)$ in
the vertical strip $-\alpha - 1 < \sigma < 2$, an estimate sufficiently strong to make possible an
application of the residue theorem to $S_2$ in the infinite strip $-\alpha - 1 < \sigma < 2$, $|t| > T_0 > 0$
($T_0$ sufficiently large). This yields

\[ S_2 = \frac{1}{2\pi i}\int_{-\alpha-1-i\infty}^{-\alpha-1+i\infty} g(1-s)\,\Gamma\left(\frac{1-s}{2}\right)\pi^{-(1-s)/2}\,y^{-s/2}\,ds + \sum_{\nu=1}^{m} R_\nu(y), \tag{2.7} \]

where the $R_\nu(y)$ are the residues of the integrand at the poles in the region $-\alpha - 1 < \sigma < 2$.
The Dirichlet series representation (2.1) of $g(s)$ and (2.4) together show that the integral
in (2.7) is equal to

\[ \frac{2}{\sqrt{y}}\sum_{n=1}^{\infty} b_n e^{-\pi n^2/y}, \qquad y > 0. \]

On the other hand, calculation of the residues of the integrand leads to

\[ \sum_{\nu=1}^{m} R_\nu(y) = \sum_{\nu=1}^{m} y^{-s_\nu/2}\,Q_\nu(\log y) = Q(y), \tag{2.8} \]

where the $Q_\nu$ are polynomials and the $s_\nu$ are the poles of the integrand in $-\alpha - 1 < \sigma < 2$.
(The conditions (2.2, ii) and (2.2, iv) imply that $\operatorname{Re} s_\nu \le 2 - \delta$, for $1 \le \nu \le m$.) Then,
$S_1 = S_2$, (2.6) and (2.7) together yield the fundamental transformation property

\[ 2\sum_{n=1}^{\infty} a_n e^{-\pi n^2 y} = \frac{2}{\sqrt{y}}\sum_{n=1}^{\infty} b_n e^{-\pi n^2/y} + Q(y). \tag{2.9} \]


At this juncture Siegel introduces another integral transform, multiplying both sides of
(2.9) by $e^{-\pi t^2 y}$ with fixed $t > 0$, and integrating on $y$ from 0 to $\infty$. Absolute convergence
justifies termwise integration on both sides of (2.9); the application of (2.5) on the right-hand
side leads to

\[ \sum_{n=1}^{\infty} a_n\left(\frac{1}{t + ni} + \frac{1}{t - ni}\right) - \pi t H(t) = 2\pi \sum_{n=1}^{\infty} b_n e^{-2\pi n t}, \tag{2.10} \]

where $H(t) = \sum_{\nu=1}^{m} t^{s_\nu - 2} H_\nu(\log t)$, with $H_\nu$ a polynomial. Since the identity (2.10)
clearly extends to the half-plane $\operatorname{Re} t > 0$, Siegel can exploit the periodicity of the right-hand
side of (2.10) and the singularities of the left-hand side, to conclude that $a_n = a_1$ for
all $n \ge 1$, thus that $f(s) = a_1\zeta(s)$. This completes the proof.
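As a worked instance of the identity (2.10), one can take $a_n = b_n = 1$, i.e. $f = g = \zeta$. Under the assumption (made for this example, not stated in the text) that $H(t)$ is the integral of $Q(y)\,e^{-\pi t^2 y}$ over $y > 0$, the theta relation gives $Q(y) = y^{-1/2} - 1$ and hence $\pi t H(t) = \pi - 1/t$, so that (2.10) becomes the classical partial fraction expansion of $\pi\coth(\pi t)$. The sketch below, with hypothetical function names, compares both sides numerically.

```python
import math


def siegel_lhs(t, terms=200_000):
    """Left-hand side of (2.10) for a_n = 1, truncated after `terms` terms.

    Uses 1/(t + n*i) + 1/(t - n*i) = 2*t / (t**2 + n**2) and the assumed
    value pi*t*H(t) = pi - 1/t derived from Q(y) = y**-0.5 - 1.
    """
    partial = sum(2.0 * t / (t * t + n * n) for n in range(1, terms + 1))
    return partial - (math.pi - 1.0 / t)


def siegel_rhs(t, terms=200):
    """Right-hand side of (2.10) for b_n = 1."""
    return 2.0 * math.pi * sum(math.exp(-2.0 * math.pi * n * t)
                               for n in range(1, terms + 1))


if __name__ == "__main__":
    for t in (0.7, 1.5):
        print(t, siegel_lhs(t), siegel_rhs(t))
```

The truncated left-hand series converges slowly (its tail is roughly $2t$ divided by the number of terms), so agreement is only to about five decimal places with the default settings. The poles of the left-hand side at $t = \pm ni$, all with residue $a_n$, are what the periodicity argument in the text exploits.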


Remark: If we put

\[ F(z) = 2\sum_{n=1}^{\infty} a_n e^{\pi i n^2 z}, \qquad G(z) = 2\sum_{n=1}^{\infty} b_n e^{\pi i n^2 z}, \]

then (2.9) becomes
\[ F(z) = (z/i)^{-1/2}\,G(-1/z) + Q(z/i), \tag{2.11} \]
on the positive imaginary axis: $z = iy$, $y > 0$. By the principle of analytic continuation,
(2.11) holds in all of $\mathcal{H}$. Thus, in $\mathcal{H}$, $K(z) = F(z) + G(z)$ satisfies
\[ K(z) = (z/i)^{-1/2}\,K(-1/z) + Q(z/i) - (z/i)^{-1/2}\,Q(i/z). \tag{2.12} \]


With $Q(z/i)$ of the form (2.8), the same is true of $(z/i)^{-1/2} Q(i/z)$, so that $Q(z/i) -
(z/i)^{-1/2} Q(i/z)$ is a “log-polynomial sum.” Consequently, $K(z)$ is a “modular integral
with log-polynomial period function.” (See §IV.3, below.)
III Hecke’s Proof


The beginning of the Hecke proof in essence follows that of Siegel, modified only to take
account of the modified hypotheses. In place of the functional equation (2.2, iv) linking the
Dirichlet series $f$ and $g$, Hecke assumes

\[ \pi^{-s}\,\Gamma(s)\,\varphi(s) = \pi^{-(1/2 - s)}\,\Gamma\left(\tfrac{1}{2} - s\right)\varphi\left(\tfrac{1}{2} - s\right) \tag{3.1} \]

with $\varphi$ a Dirichlet series: $\varphi(s) = \sum_{n=1}^{\infty} c_n n^{-s}$. In place of Siegel’s assumption (2.2, i),
which permits $f$ to have an arbitrary finite number of poles in the $s$-plane, Hecke imposes
the condition that $(s - \frac{1}{2})\varphi(s)$ can be continued to an entire function of finite genus. That is
to say, in Hecke’s formulation the polynomial $P(s) = s - 1/2$; this is what Hecke actually
intends in his condition 3(b). (See §I, above.)

Siegel’s technique, employing the inverse Mellin transform, is equally effective under
Hecke’s modified assumptions. Using this procedure, Hecke obtains, in place of Siegel’s
transformation law (2.9),

\[ \sum_{n=0}^{\infty} c_n e^{-\pi n y} = \frac{1}{\sqrt{y}}\sum_{n=0}^{\infty} c_n e^{-\pi n/y}, \tag{3.2} \]

where $c_n$, $n \ge 1$, are the coefficients in the expression of $\varphi(s)$ as a Dirichlet series and
$c_0 = \operatorname{Res}_{s=1/2}\,\varphi(s)$. (Comparison shows obvious changes from (2.9), upon which we shall
comment shortly.)

Putting $L(z) = \sum_{n=0}^{\infty} c_n e^{\pi i n z}$ permits us to rewrite (3.2) as

\[ L(z) = (z/i)^{-1/2}\,L(-1/z), \qquad z = iy,\ y > 0. \tag{3.3} \]


Since $\varphi(s)$ is assumed to converge in some right half-plane, the coefficients $c_n$ have at worst
polynomial growth in $n$. This implies that $L(z)$ is holomorphic in $\mathcal{H}$, so (3.3) holds in all
of $\mathcal{H}$. The definition of $L(z)$ shows further that $L(z + 2) = L(z)$, and this, together with
(3.3) and the growth condition on the coefficients $c_n$, implies that $L(z)$ is an entire modular
form of weight 1/2 on $\Gamma_\vartheta$, the subgroup of index 3 in $SL(2, \mathbb{Z})$ generated by $z \to z + 2$ and
$z \to -1/z$. (See §IV.2, below. Recall that the full modular group $SL(2, \mathbb{Z})$ is generated
by $z \to z + 1$ and $z \to -1/z$.)

Since $L(z)$ has precisely the same transformation properties with respect to the generators
of $\Gamma_\vartheta$ as does the Jacobi $\vartheta$-function (1.1) (see (4.1), below), Hecke can complete his proof
simply by comparing $L(z)$ with $\vartheta(z)$. It turns out that $L(z)/\vartheta(z)$ is an entire modular function
(i.e. a modular form of weight 0) on $\Gamma_\vartheta$, and thus a constant. But $L(z) =$ const. $\vartheta(z)$ means
that the $c_n$ are supported on the squares with a common value there, which leads
directly to $\varphi(s) =$ const. $\zeta(2s)$. This concludes the proof.


Remarks: The transformation law (3.3) differs from (2.12) in two essential respects:


(i) The series defining $K(z) = F(z) + G(z)$ is supported on integral squares, while the
exponents in the series defining $L(z)$ are linear in $n$.
(ii) The “period function” $Q(z/i) - (z/i)^{-1/2} Q(i/z)$ appearing on the right-hand side
of (2.12) is not present in (3.3).


That $K(z)$ is not a priori a modular form on $\Gamma_\vartheta$, but only a modular integral with period
function, makes Hecke’s proof unavailable under the conditions Siegel imposes (notwithstanding
that, as a consequence of his proof of $f(s) = g(s) = a_1\zeta(s)$, $K(z) = 2a_1(\vartheta(z) - 1)$,
so $K(z) + 2a_1$ is indeed a modular form).

Equally, Siegel’s proof fails if applied to Hecke’s case. For, multiplying both sides of
(3.2) by $e^{-\pi t^2 y}$ and integrating on $y$ from 0 to $\infty$ yields

\[ \sum_{n=0}^{\infty} \frac{c_n t}{\pi(t^2 + n)} = \sum_{n=0}^{\infty} c_n e^{-2\pi t\sqrt{n}} \tag{3.4} \]

in place of (2.10). Siegel’s derivation of Hamburger’s theorem from (2.10) breaks down if
we start instead with (3.4).


IV Modular Integrals and the Riemann-Hecke-Bochner Correspondence 


The term “modular integral” has arisen several times above, most prominently in the discussion
of Siegel’s proof of Hamburger’s theorem (§II). In this section we define the notion,
important here for its role in explaining the origin of the Abundance Principle for Dirichlet
Series (§I).
The ideas we shall introduce apply to the entire class of finitely generated discrete groups
$\Gamma$ acting on $\mathcal{H}$, of finite or infinite hyperbolic area. However, for the most part we restrict
the discussion to $\Gamma = \Gamma_\vartheta$, the subgroup of $SL(2, \mathbb{Z})$ generated by $z \to z + 2$, $z \to -1/z$.
(See §III, above.) The group $\Gamma_\vartheta$ is so called because of its connection with the Jacobi
function defined by (1.1): $\vartheta(z)$ is an entire modular form of weight $\frac{1}{2}$ on $\Gamma_\vartheta$. That is, for $z$
in $\mathcal{H}$,
\[ \vartheta(z + 2) = \vartheta(z), \qquad \vartheta(-1/z) = e^{-\pi i/4}\,z^{1/2}\,\vartheta(z), \tag{4.1} \]

and $\vartheta(z)$ has bounded behaviour at the two parabolic points of a fundamental region $R$ for
$\Gamma_\vartheta$. ($R$ can be so chosen that the two parabolic points are $i\infty$ and $-1$. The expansion (1.1)
expresses the behaviour at $i\infty$; there is a similar expansion at $-1$. See [10, Theorem 13, 46].)

1. Multiplier systems and period cocycles for $\Gamma_\vartheta$. Let $k$ be a real number and $v$ a
“multiplier system” for the weight $k$ and the group $\Gamma_\vartheta$. This means that $v$ is a function on
the group $\Gamma_\vartheta$ - thought of as a matrix group - such that


(i) $|v(M)| = 1$ for all $M$ in $\Gamma_\vartheta$;
(ii) $v(M_3)(c_3 z + d_3)^k = v(M_1)\,v(M_2)\,(c_1 M_2 z + d_1)^k (c_2 z + d_2)^k$. (4.2)


The identity (4.2, ii) is required to hold for all $z$ in $\mathcal{H}$ and $M_1$, $M_2$ in $\Gamma_\vartheta$, with $M_3 = M_1 M_2$
and $M_j = \begin{pmatrix} * & * \\ c_j & d_j \end{pmatrix}$, $1 \le j \le 3$. It is not too hard to show from (4.2, ii) that $v$ is a character
on the matrix group $\Gamma_\vartheta$ when $k \in \mathbb{Z}$, and a character on the linear fractional transformation
group $\Gamma_\vartheta$ when $k \in 2\mathbb{Z}$.
With $k$ real, $v$ a fixed multiplier system in weight $k$ for $\Gamma_\vartheta$ and $f$ a function defined on
$\mathcal{H}$, we introduce the stroke (or slash) operator

\[ (f|_k^v M)(z) = \bar v(M)\,(cz + d)^{-k} f(Mz), \qquad M = \begin{pmatrix} * & * \\ c & d \end{pmatrix} \in \Gamma_\vartheta. \tag{4.3} \]

With this notation, condition (4.2, ii) is equivalent to
\[ f|_k^v M_1 M_2 = (f|_k^v M_1)|_k^v M_2 \tag{4.4} \]
for $M_1$, $M_2$ in $\Gamma_\vartheta$ and any $f$ defined on $\mathcal{H}$.
Suppose $f$ is a function holomorphic in $\mathcal{H}$; define the functions $q_M(z) = q(M; z)$,
$M \in \Gamma_\vartheta$, as follows:

\[ \bar v(M)\,(cz + d)^{-k} f(Mz) = f(z) + q_M(z), \tag{4.5, i} \]
with $M = \begin{pmatrix} * & * \\ c & d \end{pmatrix} \in \Gamma_\vartheta$. This can be rewritten as

\[ f|M = f + q_M, \qquad M \in \Gamma_\vartheta. \tag{4.5, ii} \]

(Note the abbreviation f|M for f|;M.) The gy are called the period functions of f relative 
to {I'y, k, v}. From (4.4) the cocycle condition for {qy|M € Ts} follows directly: 


qmn = qm|N+qn, for M,N,€T». (4.6) 


A collection of functions {gy|M € Iy} satisfying (4.6) is called a cocyle for (or relative 
to) {l'9,k, v}. 
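
The passage from (4.4) to (4.6) can be written out in one line. The following is a routine check rather than part of the original argument; it uses only the definition of the period functions as $f|M - f$ and the linearity of the stroke operator in $f$.

```latex
\begin{align*}
q_{MN} &= f|MN - f                 && \text{definition of } q_{MN} \\
       &= (f|M)|N - f              && \text{by (4.4)} \\
       &= (f + q_M)|N - f          && \text{by (4.5, ii)} \\
       &= (f|N - f) + q_M|N        && \text{linearity of the stroke operator} \\
       &= q_N + q_M|N .
\end{align*}
```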

The condition (4.5) above does not restrict $f$ in any way, since $q_M$ is defined to be
$f|M - f$. To construct a meaningful theory we impose restrictions upon the $q_M$, suited to
the purpose at hand. In the present context it is essential to assume that the $q_M$ lie in $\mathcal{P}$, the
collection of all functions $f$ holomorphic in $\mathcal{H}$, subject to the growth condition

$$|f(z)| < K\left(|z|^{\alpha} + y^{-\beta}\right), \qquad y = \operatorname{Im} z > 0, \tag{4.7}$$

for some constants $K, \alpha, \beta > 0$. Note that $\mathcal{P}$ is an algebra over the complex field $\mathbb{C}$;
moreover, it is preserved under differentiation, integration and the stroke operator (4.3),
with $M \in SL(2, \mathbb{R})$.

2. Modular integrals. Assume that $\{q_M\}$ is a cocycle relative to $\{\Gamma_\theta, k, v\}$ such that
$q_M \in \mathcal{P}$, for all $M \in \Gamma_\theta$. Suppose that $f$ is holomorphic in $\mathcal{H}$ and satisfies (4.5). Standard
arguments using (4.5) imply the existence of “Fourier expansions” for $f$ at the parabolic
cusps $i\infty$ and $-1$ of $\Gamma_\theta$. (See, for example, [10, Chapter 2, 17-23].) Let $S = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$
and $T = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. The expansion at $i\infty$ has the form

$$f(z) = p_0(z) + \sum_{n=-\infty}^{\infty} a_n e^{\pi i (n + \kappa) z}, \qquad \operatorname{Im} z > 0, \tag{4.8a}$$

with complex coefficients $a_n$. Here $p_0$ is a function in $\mathcal{P}$ determined by $q(S^2; z)$ and $\kappa$
derives from the multiplier system $v$: $v(S^2) = e^{2\pi i \kappa}$, $0 \le \kappa < 1$. The expansion at $-1$
has the form

$$f(z) = p_1(z) + (z + 1)^{-k} \sum_{n=-\infty}^{\infty} b_n \exp\{-2\pi i (n + \mu)/(z + 1)\}, \tag{4.8b}$$

with complex $b_n$ and $p_1 \in \mathcal{P}$ determined by $q(S^{-2}T; z)$; $\mu$ is defined by $v(S^{-2}T) = e^{2\pi i \mu}$,
$0 \le \mu < 1$.

Assume that the expansions (4.8) are both left-finite. Then we call $f$ a modular integral
with respect to $\Gamma_\theta$, of weight $k$ and multiplier system $v$, with period functions (or period
cocycle) $\{q_M \mid M \in \Gamma_\theta\}$. If no $a_n$, $b_n$ with $n < 0$ occur in the expansions (4.8) we call $f$ an
entire modular integral. If $q_M = 0$ for all $M$ in $\Gamma_\theta$, $f$ is a modular form (entire modular
form) rather than a modular integral (entire modular integral).

In §IV.4 we shall introduce the Riemann-Hecke-Bochner (R-H-B) correspondence, which
plays an essential role in the derivation of the Abundance Principle (§I, above). To gain a
measure of flexibility we state the correspondence not merely for $\Gamma_\theta$, but for the entire class
of “Hecke triangle groups”. With $\lambda > 0$ let $S_\lambda = \begin{pmatrix} 1 & \lambda \\ 0 & 1 \end{pmatrix}$ (hence $S_1 = S = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and
$S_2 = S^2$). Then the Hecke group $G_\lambda$ is defined by

$$G_\lambda = \langle S_\lambda, T \rangle. \tag{4.9}$$

(Note that $G_1 = \Gamma(1) = SL(2, \mathbb{Z})$ and $G_2 = \Gamma_\theta$.)

Hecke [7] has shown that $G_\lambda$ is discrete if and only if (i) $\lambda \ge 2$ or (ii) $\lambda = 2\cos \pi/p$, with
$p \in \mathbb{Z}$, $p \ge 3$. When $\lambda \ge 2$, $G_\lambda$ has the single relation $T^2 = I$; when $\lambda = 2\cos \pi/p$, $G_\lambda$
has the two relations $T^2 = (S_\lambda T)^p = I$. When $\lambda \le 2$, $G_\lambda$ has a fundamental region of
finite hyperbolic area. For $\lambda > 2$, by contrast, this area is infinite.

One can define “period cocycle for $G_\lambda$” and “(entire) modular integral” by analogy with
the definitions given above for $\Gamma_\theta$ ($\lambda = 2$). For general $\lambda > 0$, the expansion (4.8a) is
replaced by

$$f(z) = p_0(z) + \sum_{n=-\infty}^{\infty} a_n e^{2\pi i (n + \kappa) z/\lambda}, \qquad \operatorname{Im} z > 0. \tag{4.10}$$

If $\lambda \ne 2$, the point $-1$ is not a parabolic cusp, so there is no analogue of (4.8b). The
definitions of “modular integral” and “entire modular integral” of course entail the same
restrictions on the expansion (4.10) as in the case $\lambda = 2$.

3. Modular integrals with log-polynomial period. Our application to the Abundance
Principle for Dirichlet series requires period cocycles $\{q_M \mid M \in G_\lambda\}$ satisfying conditions
far more stringent than $q_M \in \mathcal{P}$. To describe these conditions we introduce log-polynomial
sums, functions of the form

$$q(z) = \sum_{1 \le j \le J} z^{\alpha_j} \sum_{0 \le t \le M(j)} \beta(j, t) (\log z)^t, \tag{4.11}$$

where the exponents $\alpha_j$ and the coefficients $\beta(j, t)$ are arbitrary complex numbers. The $t$
are nonnegative integers. Note that a log-polynomial sum is holomorphic in $\mathbb{C} \setminus \{iy \mid y \le 0\}$;
in particular, $q(z)$ defined by (4.11) is holomorphic in $\mathcal{H}$.

We say that the modular integral $f$ on $\Gamma_\theta$ (respectively, $G_\lambda$) has log-polynomial period,
provided $q(S^2; z) = 0$ (respectively, $q(S_\lambda; z) = 0$) and $q(T; z)$ is a log-polynomial sum,
for the generators $S^2$ (respectively, $S_\lambda$) and $T$, in the modular transformation law for $f$
((4.5) or its analogue for $G_\lambda$). Note that $q(T; z)$ and $q(S^2; z)$ ($q(S_\lambda; z)$) are in $\mathcal{P}$. Since
$\Gamma_\theta = \langle S^2, T \rangle$ (respectively, $G_\lambda = \langle S_\lambda, T \rangle$), it follows that $q_M \in \mathcal{P}$ for all $M \in \Gamma_\theta$
(respectively, $M \in G_\lambda$), by the closure properties of $\mathcal{P}$ and the cocycle condition (4.6) (or
its analogue for $G_\lambda$). The cocycle condition implies as well that $q_T|T + q_T = 0$, since
$T^2 = I$.

The significance for us of modular integrals with log-polynomial period is this: 

By the Riemann-Hecke-Bochner correspondence (§IV.4, below), the Abundance Principle
for Dirichlet Series with Functional Equation is equivalent to the existence of infinitely
many linearly independent entire modular integrals on $\Gamma_\theta$, with log-polynomial period.


4. The Riemann-Hecke-Bochner correspondence. Before stating the correspondence
we make a few observations about the expansion (4.8a). If $f$ is an entire modular integral
on $\Gamma_\theta$, then by our definition the expansion has the form

$$f(z) = p_0(z) + \sum_{n=0}^{\infty} a_n e^{\pi i (n + \kappa) z}.$$

If, in addition, $q(S^2; z) = 0$ (as is the case when $f$ has log-polynomial period), the expansion
has the form $f(z) = \sum_{n=0}^{\infty} a_n e^{\pi i (n + \kappa) z}$.

We note further that the only relation in $\Gamma_\theta$, $T^2 = I$, does not involve the generator
$S^2$. Thus, in any weight $k$ we can determine a multiplier system $v$ on $\Gamma_\theta$ by choosing
$v(S^2) = e^{2\pi i \kappa}$, with $0 \le \kappa < 1$, but $\kappa$ otherwise arbitrary, and putting $v(T) = C$, with $C$
chosen to respect the relation $T^2 = I$. By (4.2, ii), this means that $v(T)$ has one of the four
values $\pm e^{-\pi i k/2}$, $\pm i e^{-\pi i k/2}$; however, only the two values $\pm e^{-\pi i k/2}$ give rise to nontrivial
modular forms or modular integrals on $\Gamma_\theta$. Now, given $M$ in $\Gamma_\theta$ we write $M$ as a word
in $S^2$ and $T$, and determine $v(M)$ from $v(S^2)$, $v(T)$ by applying the consistency condition
(4.2, ii).

In the statement of the correspondence we choose $\kappa = 0$, so $v(S^2) = 1$. Then the
expansion at $i\infty$ of an entire modular integral $f$ on $\Gamma_\theta$ assumes the form

$$f(z) = \sum_{n=0}^{\infty} a_n e^{\pi i n z}, \qquad \operatorname{Im} z > 0. \tag{4.12}$$


R-H-B Correspondence. Let $k$ be real and $C$ complex. Suppose $F(z)$ is holomorphic
in the upper half-plane $\mathcal{H}$, defined there by an exponential series of the form (4.12), where
the complex coefficients $a_n$ satisfy the polynomial growth condition

$$a_n = O(n^{\gamma}), \qquad \gamma > 0, \; n \to \infty. \tag{4.13}$$

Let $\Phi(s)$ be the Mellin transform of $F(iy) - a_0$:

$$\Phi(s) = \int_0^{\infty} \{F(iy) - a_0\}\, y^{s}\, \frac{dy}{y} = \pi^{-s}\, \Gamma(s) \sum_{n=1}^{\infty} a_n n^{-s}.$$
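
The closed form of the Mellin transform can be checked term by term. The following short derivation is a sketch, assuming that Re s is large enough (Re s > γ + 1 suffices, by (4.13)) for the interchange of sum and integral to be legitimate.

```latex
\begin{align*}
\Phi(s) &= \int_0^\infty \Bigl( \sum_{n=1}^\infty a_n e^{-\pi n y} \Bigr) y^{s-1}\, dy
          && \text{put } z = iy \text{ in (4.12)} \\
        &= \sum_{n=1}^\infty a_n \int_0^\infty e^{-\pi n y}\, y^{s-1}\, dy
          && \text{absolute convergence for } \operatorname{Re} s > \gamma + 1 \\
        &= \sum_{n=1}^\infty a_n (\pi n)^{-s}\, \Gamma(s)
          && \text{substitute } u = \pi n y \\
        &= \pi^{-s}\, \Gamma(s) \sum_{n=1}^\infty a_n n^{-s}.
\end{align*}
```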
Then, we have the following. 


1. The assertions (A) and (B) are equivalent: 


(A) $F(z)$ is an entire modular integral on $\Gamma_\theta$, with log-polynomial period, of weight
$k$ and multiplier system $v$ such that $v(S^2) = 1$, $v(T) = C$.

(B) $\Phi$ can be continued analytically into the entire $s$-plane, except for a finite number
of poles. Furthermore, $\Phi$ is bounded in each lacunary vertical strip,

$$\sigma_1 \le \operatorname{Re} s \le \sigma_2, \qquad |\operatorname{Im} s| \ge t_0 > 0, \quad t_0 \text{ sufficiently large},$$

and satisfies the functional equation

$$\Phi(k - s) = e^{\pi i k/2}\, C\, \Phi(s). \tag{4.14}$$

2. Consider the modular relation included in condition (A):

$$z^{-k} F(-1/z) = C F(z) + q_T(z), \tag{4.15}$$

with $q_T(z)$ of the form (4.11). The term $\beta z^{\alpha} (\log z)^t$ ($\beta \ne 0$, $t \in \mathbb{Z}$, $t \ge 0$) occurs in
$q_T(z)$ if and only if $\Phi(s)$ has poles of order $t + 1$ at $s = \alpha + k$ and $s = -\alpha$. The
only possible further singularities of $\Phi$ are simple poles at $s = 0$ and $s = k$.


Remarks: (i) By (4.14), $C = \pm e^{-\pi i k/2}$. (See the discussion immediately preceding the
statement of the R-H-B correspondence.)

(ii) The correspondence, as stated here, differs somewhat from Bochner’s original formulation
in [1], which deals with generalized Dirichlet series rather than the ordinary Dirichlet
series we have here; Bochner allows as well two distinct exponential series in the modular
relation. Furthermore, Bochner’s period function $q_T(z)$ in (4.15) is a “residual function”
(in his terminology) rather than a sum of the form (4.11). (See also [3].) However, the
sums (4.11) are residual in Bochner’s sense, and a residual function appearing as a period
function in a modular relation (4.15) necessarily has the form (4.11).


V The Abundance Principle for Dirichlet Series 
with Functional Equation 


1. Detailed statement of results. In the Introduction we have stated the Abundance
Principle in general terms, without explanatory details. We provide those now. Recall
the conditions 1, 2, 3 of §I: Let $\varphi(s)$ be an analytic function and put $R(s) = \pi^{-s}\Gamma(s)\varphi(s)$.
Assume the following concerning $\varphi(s)$ and $R(s)$.

1. There exists a polynomial $P(s)$ such that $P(s)\varphi(s)$ is an entire function of finite
genus.
2. $R(s) = R(\tfrac12 - s)$.
3. $\varphi(s)$ can be expanded in a Dirichlet series convergent in some right half-plane.


Theorem 1 Let $\sigma_0$ be a real number $> \tfrac14$. Let $\mathcal{A}(\sigma_0)$ be the space of rational functions $A(s)$
with poles restricted to the strip $\tfrac12 - \sigma_0 < \operatorname{Re} s < \sigma_0$ and satisfying the symmetry condition
$A(\tfrac12 - s) = A(s)$. Let $\mathcal{A}_\theta(\sigma_0)$ be the subspace of $A$ in $\mathcal{A}(\sigma_0)$ such that $R(s) - A(s)$ is
entire for some $R(s) = \pi^{-s}\Gamma(s)\varphi(s)$ satisfying 1, 2 and 3. Then for $A_1, \ldots, A_n \in \mathcal{A}(\sigma_0)$,
with n > [Z +- 3] + 2, some nontrivial linear combination of A,,..., An is in Ay (09). 


Theorem 1 can be generalized to


Theorem 2 (Weight k Abundance Principle). Let $k$ be an arbitrary real number. For
$\sigma_0 > k/2$, let $\mathcal{A}(\sigma_0; k)$ denote the space of rational functions $A(s)$ with poles restricted to
the strip $k - \sigma_0 < \operatorname{Re} s < \sigma_0$ and such that $A(k - s) = A(s)$. Replace condition 2 by the
functional equation

$$R(s) = R(k - s). \tag{$2_k$}$$

Let $\mathcal{A}_\theta(\sigma_0; k)$ be the subspace of $A$ in $\mathcal{A}(\sigma_0; k)$ such that $R(s) - A(s)$ is entire for some
$R(s) = \pi^{-s}\Gamma(s)\varphi(s)$ satisfying 1, $2_k$, and 3. Then for $A_1, \ldots, A_n \in \mathcal{A}(\sigma_0; k)$, with
$n > N(\sigma_0, k)$, some nontrivial linear combination of $A_1, \ldots, A_n$ lies in $\mathcal{A}_\theta(\sigma_0; k)$. Here,
$N(\sigma_0, k)$ is an explicit constant dependent only upon $\sigma_0$ and $k$.


For $k > 2$, Theorem 2 can be strengthened considerably, to


Theorem 3 (Mittag-Leffler Principle). Let $k > 2$ and let $A(s)$ be any rational function
satisfying $A(k - s) = A(s)$. Then there exists $\varphi(s)$ such that $R(s) = \pi^{-s}\Gamma(s)\varphi(s)$ satisfies
1, $2_k$, and 3, and such that $R(s) - A(s)$ is entire.


Finally, we can extend all of these results to the case in which $2_k$ is replaced by the
functional equation

$$R(s) = -R(k - s). \tag{$4_k$}$$


Theorem 4 (a) For $k$ an arbitrary real number and $\sigma_0 > k/2$, let $\mathcal{B}(\sigma_0; k)$ be the space
of rational functions $A(s)$ with poles restricted to the strip $k - \sigma_0 < \operatorname{Re} s < \sigma_0$ and
satisfying $A(k - s) = -A(s)$. Let $\mathcal{B}_\theta(\sigma_0; k)$ denote the subspace of $A$ in $\mathcal{B}(\sigma_0; k)$ such
that $R(s) - A(s)$ is entire for some $R(s) = \pi^{-s}\Gamma(s)\varphi(s)$ satisfying 1, 3 and $4_k$. Then
for $A_1, \ldots, A_n$ in $\mathcal{B}(\sigma_0; k)$ with $n > M(\sigma_0, k)$, some nontrivial linear combination of
$A_1, \ldots, A_n$ lies in $\mathcal{B}_\theta(\sigma_0; k)$. Here $M(\sigma_0, k)$ is an explicit constant determined by $\sigma_0$
and $k$.

(b) Let $k > 2$ and suppose $A(s)$ is a rational function with $A(k - s) = -A(s)$. Then there
exists $\varphi(s)$ such that $R(s) = \pi^{-s}\Gamma(s)\varphi(s)$ satisfies 1, 3 and $4_k$, and such that $R(s) - A(s)$
is entire.


Theorem 1 is to be found in [12, 362, Theorem 1]. Theorem 3 appeared as Theorem 2
of the same article (362-3).
2. Brief discussion of the proofs. The first step is the observation that the abundance
of Dirichlet series with a functional equation is equivalent, by the R-H-B correspondence,
to the abundance of entire modular integrals on $\Gamma_\theta$, with log-polynomial period. In this
equivalence the functional equation $2_k$ is associated with modular integrals of weight $k$ and
multiplier system $v_k^+$ determined by

$$v_k^+(S^2) = 1, \qquad v_k^+(T) = e^{-\pi i k/2}, \tag{5.1}$$

while $4_k$ corresponds to modular integrals of weight $k$ and multiplier system $v_k^-$, determined by

$$v_k^-(S^2) = 1, \qquad v_k^-(T) = -e^{-\pi i k/2}. \tag{5.2}$$

The proofs, then, entail the construction of “many” modular integrals of fixed weight $k$ on
$\Gamma_\theta$, with log-polynomial period for both multiplier systems $v_k^+$, $v_k^-$.

The construction of these modular integrals relies upon Eichler’s “generalized Poincaré
series” (GPS), first introduced in [4], and later developed and applied in [14,11,12,13]. To
construct a GPS we must have, at the outset, a period cocycle $\{q_M\}$ on $\Gamma_\theta$, in weight $k$ and
connected with the appropriate multiplier system, either $v_k^+$ or $v_k^-$ in the present situation.
We require further that $q(S^2; z) = 0$ and that $q(T; z)$ is a log-polynomial sum. (Recall the
alternative notation $q(M; z)$ for $q_M(z)$.) Starting with these two restrictions, we wish to
generate $\{q_M\}$ by applying the cocycle condition (4.6).

As it turns out, the necessary condition

$$q_T|T + q_T = 0 \tag{5.3}$$

of §IV.3 is sufficient for this process to yield a well-defined period cocycle $\{q_M\}$. Initially,
let $q$ be any log-polynomial sum whatsoever and put $q_T = q|T - q$, which is then a log-polynomial
sum satisfying (5.3). We now define a period cocycle in $\Gamma_\theta$ by writing $M \in \Gamma_\theta$
as a word in $S^2$ and $T$ and then applying (4.6) several times to define $q_M$. While the relation
$T^2 = I$ gives rise to the complication that $M$ is not given uniquely as a word in $S^2$ and $T$,
the restriction (5.3) on $q_T$ guarantees the uniqueness of $q_M$ defined in this manner.
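
That the choice of $q|T - q$ for the period function of $T$ satisfies (5.3) is a short computation. It is sketched here under the assumption, suggested by the relation $T^2 = I$ and by the admissible values $\pm e^{-\pi i k/2}$ of $v(T)$, that the stroke operator satisfies $q|T^2 = q$.

```latex
\begin{align*}
q_T|T + q_T &= (q|T - q)|T + (q|T - q) \\
            &= q|T^2 - q|T + q|T - q   && \text{by (4.4) and linearity} \\
            &= q|T^2 - q \\
            &= 0                       && \text{assuming } q|T^2 = q .
\end{align*}
```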

Next, let $m$ be a positive even integer. With $\{q_M\}$ in hand, define Eichler’s generalized
Poincaré series $\Psi(\{q_M\}; m; z) = \Psi(z)$ by

$$\Psi(z) = \sum_{V} q_V(z)\,(cz + d)^{-m}; \tag{5.4}$$

the summation is on all $V = \begin{pmatrix} * & * \\ c & d \end{pmatrix} \in \Gamma_\theta$ with distinct lower rows. (The condition
$q(S^2; z) = 0$ ensures that the individual terms of the series depend only upon the lower
row $c, d$ of $V \in \Gamma_\theta$.) Eichler shows that, for sufficiently large $m$, the series (5.4) converges
absolutely for $z$ in $\mathcal{H}$, and uniformly on compact subsets of $\mathcal{H}$ [4]. Eichler’s proof assumes
that the $q_V$ are polynomials, but a slight elaboration of his method establishes convergence
for the more general case $q_V \in \mathcal{P}$, $V \in \Gamma_\theta$ [11, 615-619]. Since we began with $q_T$ a log-polynomial
sum, it follows that $q_V \in \mathcal{P}$ for all $V$ in $\Gamma_\theta$, so the proof applies here.

As a consequence of absolute convergence, $\Psi(z)$ has the transformation property

$$(\Psi|M)(z) = (\gamma z + \delta)^{m}\, \Psi(z) - (\gamma z + \delta)^{m}\, E_m(z)\, q_M(z), \tag{5.5}$$

for $M = \begin{pmatrix} * & * \\ \gamma & \delta \end{pmatrix} \in \Gamma_\theta$, where $E_m(z)$ is the familiar Eisenstein series of weight $m$,

$$E_m(z) = \sum_{V} (cz + d)^{-m}, \tag{5.6}$$

with summation conditions as in (5.4). Since $m$ is an even integer, $E_m(z)$ does not
vanish identically, so we may form the quotient $F(z) = -\Psi(z)/E_m(z)$. By (5.5) and
the well-known modular transformation properties of $E_m(z)$, $F(z)$ has the formal behavior
of a modular integral relative to $(\Gamma_\theta, k, v_k^{\pm})$, with the preassigned cocycle $\{q_M\}$ of period
functions:

$$F|M = F + q_M, \qquad M \in \Gamma_\theta. \tag{5.7}$$
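
The step from (5.5) to (5.7) can be made explicit. The sketch below assumes the standard transformation law $E_m(Mz) = (\gamma z + \delta)^m E_m(z)$ for the Eisenstein series, and uses the fact that the stroke operator of weight $k$ applied to a quotient $\Psi/E_m$ is $(\Psi|M)$ divided by $E_m(Mz)$.

```latex
\begin{align*}
(F|M)(z) &= -\frac{(\Psi|M)(z)}{E_m(Mz)} \\
         &= -\frac{(\gamma z+\delta)^{m}\,\Psi(z) - (\gamma z+\delta)^{m}\,E_m(z)\,q_M(z)}
                  {(\gamma z+\delta)^{m}\,E_m(z)}
            && \text{by (5.5)} \\
         &= -\frac{\Psi(z)}{E_m(z)} + q_M(z) \\
         &= F(z) + q_M(z).
\end{align*}
```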


On the other hand, $F$ may not be an entire modular integral (or a modular integral at all)
in the sense of §IV.2, since the zeros of $E_m(z)$ in $\mathcal{H}$ may well be poles of $F$. There is the
further issue that $F$ may not behave appropriately at the parabolic cusps, $-1$ and $i\infty$, of
$\Gamma_\theta$. The proofs of the Theorems are largely procedures for modifying $F$ to obtain entire
modular integrals for $\{\Gamma_\theta, k, v_k^{\pm}\}$.

In all of the proofs the key point is this: the log-polynomial sum $q_T$ is completely arbitrary
except for the restriction (5.3). The proof of Theorem 4 employs the multiplier system $v_k^-$,
given in (5.2), while the proofs of Theorems 1-3 utilize one characterized by (5.1). This
is the only distinction. The proofs of those results labelled “Mittag-Leffler” rely upon a
“Mittag-Leffler” theorem for modular forms of weights $k > 2$. This theorem fails in weights
$k < 2$. For detailed proofs see [12].


VI Conclusion 


1. Extension to other Hecke groups. All of the above results rely upon considerations
regarding the particular Hecke triangle group $\Gamma_\theta = \langle S^2, T \rangle$. There is a modification of
these results, in which the function $R(s) = \pi^{-s}\Gamma(s)\varphi(s)$ is replaced by

$$R_\lambda(s) = \left(\frac{2\pi}{\lambda}\right)^{-s} \Gamma(s)\, \varphi_\lambda(s), \qquad \lambda > 2, \tag{6.1}$$

with $\varphi_\lambda(s)$ a Dirichlet series convergent in some right half-plane. (In the notation of (6.1),
$R(s) = R_2(s)$.) Bringing to bear the Hecke triangle groups $G_\lambda$ defined in (4.9) with $\lambda > 2$,
leads to


Theorem 5 (Mittag-Leffler) (a) Let $k$ be an arbitrary real number and $A(s)$ any rational
function such that $A(k - s) = A(s)$. Then there exists $\varphi_\lambda(s)$ such that $R_\lambda(s)$, defined in
(6.1), satisfies 1, $2_k$, and 3, and such that $R_\lambda(s) - A(s)$ is entire.

(b) The same, with $A(k - s) = -A(s)$ and $2_k$ replaced by $4_k$.


The proof, based in part upon a Mittag-Leffler theorem for modular forms on $G_\lambda$ of all
real weights, can be found in [13].

2. Zeros. Ever since Riemann’s path-breaking work on $\zeta(s)$ [16], there has been a great
deal of interest in the zeros of Dirichlet series, especially those with Euler product. While
we can assert nothing about Euler products for the Dirichlet series constructed here, the
doctoral dissertation of Lekkerkerker [15] contains results yielding information about the
distribution of their zeros in the plane. These generalize familiar results concerning the zeros
of Riemann’s $\zeta(s)$.

Specifically, Theorem 4 in [15, chapter II, 24] immediately implies the following for the
Dirichlet series $\varphi(s)$ occurring in our Theorems 1-5: Let $R$ be a rectangle of the form

$$|\operatorname{Re} s - k/2| \le \alpha, \qquad |\operatorname{Im} s| \le \beta,$$

with $\alpha, \beta$ real, and such that $\varphi(s)$ is holomorphic in the complement of $R$. Let $N_1(T)$
($N_2(T)$) denote the number of zeros $s_0$ of $\varphi(s)$ in the complement of $R$ such that $0 \le
\operatorname{Im} s_0 \le T$ ($-T \le \operatorname{Im} s_0 \le 0$). Then,

$$N_i(T) = \frac{1}{\pi}\, T \log T + d_\varphi T + O(\log T),$$

$T \to \infty$, for $i = 1$ and 2. Here $d_\varphi$ is a real number determined by $\varphi(s)$.

Chapter IV of [15] presents results concerning the zeros on the “critical line” (in our
notation, $\operatorname{Re} s = k/2$) of Dirichlet series with functional equations, but there are technical
difficulties in applying them to the Dirichlet series constructed here. This matter bears
further investigation.


Notes 


1. (p. 1) Hecke’s words are “$\varphi(s/2)$ soll in eine irgendwo konvergente Dirichlet-Reihe entwickelbar
sein” (emphasis added). However, Hamburger’s proof [5] requires that $\varphi(s/2)$ be absolutely
convergent for Re s > 1. Siegel [17] relaxes this condition, assuming convergence for Re s >
$2 - \delta$, $\delta > 0$. To my knowledge, the proof of the stronger formulation of Hamburger’s version,
asserted by Hecke, did not appear until 1956 [2, Theorem 7.1].

2. (p. 1) Hecke assumes that the pole at s = 1/2 is simple.

3. (p. 3) In hindsight, Hamburger’s original proof [5] does suggest these ideas, but they are not 
actually present. 
A Survey of Number Theory and Cryptography


Neal Koblitz 


Our purpose is to give an overview of the applications of number theory to public-key cryptography. We 
conclude by describing some tantalizing unsolved problems of number theory that turn out to have a bearing 
on the security of certain cryptosystems. 


Key-words. Public-Key Cryptography, Primality, Factorization, Discrete Logarithm, 
1 Introduction


Cryptography, broadly defined, is the science that studies a wide range of issues in the 
transmission and safeguarding of information. The increasing importance of cryptography 
in the “information age” and the concomitant flourishing of cryptographic research have had 
a profound impact on number theory, causing certain algebraic, analytic, and computational 
questions unexpectedly to acquire a practical urgency. Most of the applications of number 
theory have arisen since 1976 as a result of the development of public-key cryptography. 

This survey is devoted to the questions that one encounters in the study of public-key 
cryptosystems. After discussing the idea of public-key cryptography and its importance, we 
next describe certain prototypical public-key constructions that use number theory. Then 
we survey the progress that has been made in the three number-theoretic problems that are 
at the heart of these constructions — primality testing, factorization of integers, and discrete 
logarithms in a finite field. We also treat elliptic curve cryptosystems and the corresponding 
discrete logarithm problem. We conclude by describing some difficult unsolved problems 
of number theory that have cryptographic significance. 


2 Early History 


A cryptosystem for message transmission means a map from units of ordinary text called 
plaintext message units (each consisting of a letter or block of letters) to units of coded text 
called ciphertext message units. The idea of using arithmetic operations to construct such 
a map goes back at least to the Romans. In modern terminology, they used the operation of 
addition modulo N, where N is the number of letters in the alphabet, which we suppose has 
been put in one-to-one correspondence with Z/NZ. For example, if N = 26 (i.e., messages
are in the usual Latin alphabet, with no additional symbols for punctuation, numerals, capital
letters, etc.), the Romans might encipher single letter message units according to the formula
C ≡ P + 3 (mod 26). This means that we replace each plaintext letter by the letter three
positions farther down the alphabet (with the convention that X ↦ A, Y ↦ B, Z ↦ C).
It is not hard to see that the Roman system, or in fact any cryptosystem based on a
permutation of single letter message units, is easy to break.
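
To make the Roman system concrete, here is a minimal Python sketch of the map C ≡ P + 3 (mod 26), together with the exhaustive attack that makes it easy to break. The function names are illustrative choices, and the sketch assumes messages consist only of the 26 upper-case Latin letters.

```python
import string

# The alphabet, identified with Z/26Z: A = 0, B = 1, ..., Z = 25.
ALPHABET = string.ascii_uppercase


def shift_encipher(plaintext, shift=3):
    """Apply C = P + shift (mod 26) to every letter of an upper-case message."""
    n = len(ALPHABET)
    return "".join(ALPHABET[(ALPHABET.index(ch) + shift) % n] for ch in plaintext)


def shift_decipher(ciphertext, shift=3):
    """Invert the shift by adding -shift modulo 26."""
    return shift_encipher(ciphertext, -shift)


def all_candidate_plaintexts(ciphertext):
    """Try every one of the 26 possible shifts; one of them is the plaintext."""
    return [shift_decipher(ciphertext, s) for s in range(len(ALPHABET))]
```

Since there are only 26 possible shifts, an eavesdropper can simply read through the output of `all_candidate_plaintexts` and pick out the meaningful message.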

In the 16th century, the French cryptographer Vigenère invented a variant on the Roman
system that is not quite so easy to break. He took a message unit to be a block of k letters
(in modern terminology, a k-vector over Z/NZ). He then shifted each block by a “code
word” of length k; in other words, his map from plaintext to ciphertext message units was
translation of $(\mathbb{Z}/N\mathbb{Z})^k$ by a fixed vector.
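
The translation by a fixed vector can be sketched in a few lines of Python. This is an illustration only, assuming N = 26 and upper-case messages; translating each block of k letters by the code word is the same as adding the code word's letters cyclically along the message.

```python
def vigenere_encipher(plaintext, code_word):
    """Translate each block of k = len(code_word) letters by the code word, mod 26."""
    out = []
    for i, ch in enumerate(plaintext):
        shift = ord(code_word[i % len(code_word)]) - ord("A")
        out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)


def vigenere_decipher(ciphertext, code_word):
    """Subtract the code word again to recover the plaintext."""
    out = []
    for i, ch in enumerate(ciphertext):
        shift = ord(code_word[i % len(code_word)]) - ord("A")
        out.append(chr((ord(ch) - ord("A") - shift) % 26 + ord("A")))
    return "".join(out)
```

With a code word of length 1 this reduces to the Roman shift cipher, which is one way to see why longer code words are harder, but not impossible, to break.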

For the most part, until about 20 years ago only rather elementary number theory was 
used in cryptography. The first mention in print of a new type of cryptography seems to have 
been a passage in a book about time-sharing systems that was published in 1968 (p. 91-92
of [103]). In it, the author describes a new one-way cipher used by R. M. Needham in
order to make it possible for a computer to verify passwords without storing information 
that could be used by an intruder to impersonate a legitimate user. 


In Needham’s system, when the user first sets his password, or whenever he changes it, 
it is immediately subjected to the enciphering process, and it is the enciphered form that 
is stored in the computer. Whenever the password is typed in response to a demand from 
the supervisor for the user’s identity to be established, it is again enciphered and the 
result compared with the stored version. It would be of no immediate use to a would-be 
malefactor to obtain a copy of the list of enciphered passwords, since he would have 
to decipher them before he could use them. For this purpose, he would need access 
to a computer and even if full details of the enciphering algorithm were available, the 
deciphering process would take a long time. 


In [89] the first detailed description of such a one-way function was published. The
original passwords and their enciphered forms are regarded as integers modulo a large
prime p, and the “one-way” map $\mathbf{F}_p \to \mathbf{F}_p$ is given by a polynomial f(x) which is not
hard to evaluate by computer but which takes an unreasonably long time to invert. Purdy
used $p = 2^{64} - 59$, $f(x) = x^{2^{24}+17} + a_1 x^{2^{24}+3} + a_2 x^3 + a_3 x^2 + a_4 x + a_5$, where the
coefficients $a_i$ were arbitrary 19-digit integers.
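
The asymmetry between evaluating and inverting such a polynomial can be seen in a short Python sketch. The five coefficients below are hypothetical 19-digit placeholders, not Purdy's actual values (which are not given here); the modulus and the exponents are those stated above.

```python
P = 2**64 - 59  # the prime modulus stated in the text

# Hypothetical 19-digit coefficients a_1, ..., a_5, chosen only for illustration.
COEFFS = (
    1234567890123456789,
    9876543210987654321,
    1111111111111111111,
    2222222222222222222,
    3333333333333333333,
)


def one_way(x, coeffs=COEFFS, p=P):
    """Evaluate x^(2^24+17) + a1*x^(2^24+3) + a2*x^3 + a3*x^2 + a4*x + a5 mod p."""
    a1, a2, a3, a4, a5 = coeffs
    x %= p
    return (
        pow(x, 2**24 + 17, p)        # fast: repeated squaring
        + a1 * pow(x, 2**24 + 3, p)
        + a2 * pow(x, 3, p)
        + a3 * x * x
        + a4 * x
        + a5
    ) % p
```

Evaluation costs a few dozen modular multiplications, whereas recovering x from the value of f(x) means finding a root of a polynomial of degree about 2^24 over the field of p elements.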


3 The Idea of Public-Key Cryptography 


Until 1976, all cryptographic message transmission was by what can be called private key. 
This means that someone who has enough information to encrypt messages automatically 
has enough information to decipher messages as well. As a result, any two users of the 
system who want to communicate secretly must have exchanged keys in a safe way, e.g., 
using a trusted courier. 

The face of cryptography was radically altered when Diffie and Hellman invented an 
entirely new type of cryptography, called public-key [28]. At the heart of this concept is 
the idea of using a one-way function for encryption. Speaking informally, we can define a 
public-key encryption function (also called a “trapdoor” function) as a map from plaintext 
message units to ciphertext message units that can be easily computed by anyone having 
the so-called “public” key but whose inverse function (which deciphers the ciphertext 
message units) cannot be computed in a reasonable amount of time without some additional
information (the “private” key). This means that everyone can send a message to a given user 
using the same enciphering key, which they simply look up in a public directory. There is 
no need for the sender to have made any prior secret arrangement with the recipient; indeed, 
the recipient need never have had any prior contact with the sender at all. 

It was the invention of public-key cryptography that led to a dramatic expansion of the 
role of number theory in cryptography. The reason is that number theory seems to be the 
best source of one-way functions. Later we shall discuss the most important examples. 


3.1 Tasks for Public-Key Cryptography 
The most common purposes for which public-key cryptography has been applied are: 


(1) confidential message transmission; 

(2) authentication (verification that the message was sent by the person claimed and that 
it has not been tampered with), often using hash functions and digital signatures; 
password and identification systems (proving authorization to have access to data or 
a facility, or proving that you are who you claim to be); non-repudiation (guarding 
against people claiming not to have agreed to something that they really agreed to); 

(3) key exchange, where two people using the open airwaves want to agree upon a secret 
key for use in some private-key cryptosystem; 

(4) coin flip (also called bit commitment), for example, two chess players in different 
cities want to determine by telephone (or e-mail) who plays white; 

(5) secret sharing, where some secret information (such as the password to launch a 
missile) must be available to k subordinates working together but not to k − 1 of them;

(6) zero knowledge proof, where you want to convince someone that you have suc- 
cessfully solved a number-theoretic or combinatorial problem (for example, you 
have found the square root of an integer modulo a large unfactored integer, or you 
have 3-colored a map) without conveying any knowledge whatsoever of what the 
solution is. 


3.2 Probabilistic Encryption 


Most of the number theory based cryptosystems for message transmission are deterministic, 
in the sense that a given plaintext will always be encrypted into the same ciphertext by 
anyone. However, deterministic encryption has two disadvantages: (1) if an eavesdropper 
knows that the plaintext message belongs to a small set (for example, the message is either 
“yes” or “no”), then she can simply encrypt all possibilities in order to determine which is
the supposedly secret message; and (2) it seems to be very difficult to prove anything about 
the security of a system if the encryption is deterministic. For these reasons, probabilistic 
encryption was introduced. We will not discuss this further or give examples in the present 
paper. For more information, see the fundamental papers on the subject [33, 34]. 

We will now discuss two examples of public-key cryptosystems, RSA [93] and Diffie-Hellman
[28], that have been at the center of research in public-key cryptography. Both
are connected with fundamental questions in number theory: integer factorization and
discrete logarithms, respectively. Although the systems can be modified to perform most
or all of the six tasks listed above, we will describe a protocol for only one of these tasks
in each case (message transmission in the case of RSA, and key exchange in the case of
discrete log).


4 The RSA Cryptosystem 


Suppose that we have a large number of users of our system, each of whom might want to 
send a secret message to any one of the other users. We shall assume that the message units 
m have been identified with integers in the range 0 ≤ m < N. For example, a message
might be a block of k letters in the Latin alphabet, regarded as an integer to the base 26 with
the letters of the alphabet as digits; in that case $N = 26^k$. In practice, in the RSA system
N is a number of between 200 and 300 decimal digits.

Each user A (traditionally named Alice) selects two extremely large primes p and q whose
product n is greater than N. Alice keeps the individual primes secret, but she publishes the
value of n in a directory under her name. She also chooses at random an exponent e that
must have no common factor with p − 1 or q − 1 (and presumably has the same order of
magnitude as n), and publishes that value along with n in the directory. Thus, her public
key is the pair (n, e).

Suppose that another user B (Bob) wants to send Alice a message m. He looks up
her public key in the directory, computes the least nonnegative residue of $m^e$ modulo n,
and sends Alice this ciphertext value (which we denote c). Bob can perform the modular
exponentiation $c \equiv m^e \pmod{n}$ very rapidly (in $O(\log^3 n)$ bit operations), using the so-called
“repeated squaring method” (see §4.6.3 of [45]).

To decipher the message, Alice uses her secret deciphering key d, which is any integer
with the property that de ≡ 1 (mod p − 1) and de ≡ 1 (mod q − 1). She can find such
a d easily by applying the extended Euclidean algorithm to the two numbers e and l.c.m.
(p − 1, q − 1). It is easy to check (using Fermat’s Little Theorem) that if Alice computes
the least nonnegative residue of $c^d$ modulo n, the result will be the original message m.
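
The whole RSA procedure fits in a short Python sketch. The primes used in the usage note are toy values chosen so the arithmetic can be followed by hand; they are an assumption for illustration and are far too small for any real use. The function names are likewise illustrative. The modular inverse is computed with the built-in three-argument `pow` (Python 3.8 or later), which plays the role of the extended Euclidean algorithm.

```python
from math import gcd


def make_keys(p, q, e):
    """Return Alice's public key (n, e) and her secret deciphering exponent d."""
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # l.c.m.(p - 1, q - 1)
    if gcd(e, lam) != 1:
        raise ValueError("e must have no common factor with p - 1 or q - 1")
    d = pow(e, -1, lam)  # d*e = 1 mod lam, hence mod p - 1 and mod q - 1
    return (n, e), d


def encipher(m, public_key):
    """Bob's step: c = m^e mod n, computed by repeated squaring."""
    n, e = public_key
    return pow(m, e, n)


def decipher(c, d, n):
    """Alice's step: m = c^d mod n."""
    return pow(c, d, n)
```

For example, `make_keys(61, 53, 17)` gives n = 3233 and d = 413, and `decipher(encipher(65, (3233, 17)), 413, 3233)` returns the original message 65.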

What would prevent an unauthorized person C (Cynthia) from using the public key (n, e) 
to decipher the message? The problem for Cynthia is that without knowing the factors p 
and q of n there is apparently no way to find a deciphering exponent d that inverts the 
operation $m \mapsto m^e \pmod{n}$. Nor does there seem to be any way of inverting the encryption
other than through a deciphering exponent. Here I use the words “apparently” and “seem”
because these assertions have not been proved. Thus, one can only say that apparently 
breaking the RSA cryptosystem is as hard as factoring n. 


5 Diffie-Hellman Key Exchange 


Another basic type of public-key cryptographic system is based on the discrete logarithm 
problem. First we define this problem. 

Let $\mathbf{F}_q^*$ denote the multiplicative group of the finite field of q elements. Let $g \in \mathbf{F}_q^*$ be a
fixed element (“base”). The discrete log problem in $\mathbf{F}_q^*$ to the base g is the problem, given
$x \in \mathbf{F}_q^*$, of determining an integer y such that $x = g^y$ (if such y exists; otherwise, one must
receive an output to the effect that x is not in the group generated by g).
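
The definition can be illustrated by the most naive possible solver, which simply tries successive powers of g. This is a sketch for the prime-field case q = p only, and it takes time proportional to the order of the group, so it says nothing about how the problem is attacked in practice.

```python
def discrete_log(g, x, p):
    """Return the least y >= 0 with g^y = x (mod p), or None if x is not a power of g."""
    x %= p
    value = 1  # g^0
    for y in range(p - 1):  # the order of g divides p - 1
        if value == x:
            return y
        value = value * g % p
    return None  # x is not in the subgroup generated by g
```

For instance, modulo 23 with base 5 the call `discrete_log(5, 8, 23)` returns 6, while `discrete_log(2, 3, 7)` returns `None` because 3 is not a power of 2 modulo 7.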
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The Diffie-Hellman key exchange works as follows. Suppose that Alice and Bob want 
to agree upon a large integer to serve as a key for some private-key cryptosystem. This 
must be done using open communication channels, i.e., any eavesdropper (Cynthia) knows 
everything Alice sends to Bob and everything Bob sends to Alice. Alice and Bob first agree 
on the field F, and a base element g. They also agree on a map from F7, to the positive 
integers; for example, in the case g = p itcould be the obvious map ia — {1,2,..., p—1}. 
All of this is agreed upon publicly, so that Cynthia also has this information at her disposal. 

Next, Alice and Bob choose random secret integers $k_A$ and $k_B$, respectively. Alice sends
Bob the group element $g^{k_A}$ (but not her secret integer $k_A$), and Bob sends Alice the element
$g^{k_B}$ (but not the integer $k_B$). The common key is then $g^{k_A k_B} \in \mathbf{F}_q^*$ (or rather, the integer
associated to $g^{k_A k_B}$ under the agreed upon correspondence). Alice determines this key by
taking the element $g^{k_B}$ received from Bob and raising it to the $k_A$-th power; Bob determines
the key by taking the element $g^{k_A}$ that he received from Alice and raising it to the $k_B$-th
power. But Cynthia is in the unfortunate position of having to determine the element $g^{k_A k_B}$
knowing only $g^{k_A}$ and $g^{k_B}$ but not $k_A$ or $k_B$.
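
The exchange can be sketched in a few lines. This is only an illustration under stated assumptions: the field is taken to be the prime field with q = 2^127 − 1 (a Mersenne prime), the base g = 3 is an arbitrary choice, and the function names are hypothetical. The map to integers is the obvious one for a prime field.

```python
import secrets

q = 2**127 - 1      # a Mersenne prime, standing in for the agreed field F_q (assumption)
g = 3               # publicly agreed base element (assumption)

def public_value(secret):
    """The group element g^secret that is sent over the open channel."""
    return pow(g, secret, q)

def shared_key(received, secret):
    """Raise the element received from the other party to one's own secret power."""
    return pow(received, secret, q)

k_a = secrets.randbelow(q - 2) + 1   # Alice's secret integer
k_b = secrets.randbelow(q - 2) + 1   # Bob's secret integer

key_alice = shared_key(public_value(k_b), k_a)   # (g^k_B)^k_A
key_bob = shared_key(public_value(k_a), k_b)     # (g^k_A)^k_B
```

Only `public_value(k_a)` and `public_value(k_b)` cross the channel; both parties nevertheless arrive at the same integer.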

The problem facing Cynthia is the so-called Diffie-Hellman problem: Given $g$, $g^{k_A}$,
$g^{k_B} \in \mathbf{F}_q^*$, find $g^{k_A k_B}$. It is easy to see that anyone who can solve the discrete log problem
in $\mathbf{F}_q^*$ can then immediately solve the Diffie-Hellman problem as well. The converse is not
known. That is, it is conceivable (though generally thought to be unlikely) that someone 
could invent a way to solve the Diffie-Hellman problem without being able to find discrete 
logarithms. In other words, breaking the Diffie-Hellman key exchange has not been proved 
to be equivalent to solving the discrete log problem. But for practical purposes it is probably 
safe to assume that the Diffie-Hellman key exchange in a group G is secure provided that 
the discrete logarithm problem in G is intractable. See [14] for the latest partial results 
supporting the conjectured equivalence of the Diffie-Hellman and discrete log problems. 

Several related systems based on the discrete logarithm problem in $\mathbf{F}_q^*$ have also been
developed. Perhaps the most important is the Digital Signature Standard (DSS) proposed 
by the U.S. government as a public key signature method that could play the same role in 
standardizing digital signatures that DES played in standardizing the commercial use of 
private-key cryptography. 


6 Primality Testing 


In the implementation of several of the most popular cryptosystems (RSA, Diffie-Hellman 
in the multiplicative group of a prime field, etc.) one has to find large prime numbers, 
usually of 100 to 200 digits. Suppose that we generate a random 100-digit odd number n. 
By the Prime Number Theorem, n has approximately a 1 out of 115 chance of being prime.
We then apply a test of primality to n. If n turns out to be composite, we choose another
random odd number (or perhaps simply replace n by n + 2). If n passes the primality test, 
then we are happy, and choose p = n to be our prime. 
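
The figure of 1 in 115 can be recovered from the Prime Number Theorem directly. The short calculation below is a sketch: it uses the density 1/log N for integers near N and doubles it because only odd candidates are drawn; the variable names are illustrative.

```python
from math import log

digits = 100
density_all = 1 / log(10 ** digits)   # PNT: about 1/ln(N) of the integers near N are prime
density_odd = 2 * density_all         # restricting to odd numbers doubles the chance
expected_trials = 1 / density_odd     # average number of odd candidates per prime found
```

For 200-digit candidates the same calculation gives roughly twice as many trials.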

Depending on how demanding we are, the term “primality test” means one of three things: 


(1) a sequence of easily verified tests of compositeness, each of which is a congruence
that is likely (but not certain) to fail if n is composite; thus, if n satisfies all of these
congruences, it is a probable prime;

(2) an efficient algorithm that tells us with certainty if n is prime;

(3) an efficient algorithm that not only guarantees the primality of n but also gives us a 
certificate of primality, i.e., a list of number theoretic relations that can all be true 
only if n is prime and that can be verified very rapidly. 


We will discuss each type of primality test in turn. 


6.1 Compositeness Tests 


The simplest such test is trial division. Suppose, for example, that we divide a large random 
odd number n by all odd primes less than 100. If n is composite, then it is easy to see that
there is less than a 25% probability that this trial division will fail to reveal a factor of n. 
However, trial division is practical only as a preliminary step to eliminate trivial composites. 

A more promising compositeness test is based on Fermat’s Little Theorem, according
to which $a^{n-1} \equiv 1 \pmod n$ for any a prime to n whenever n is prime. For any a, recall
that $a^{n-1}$ can be computed modulo n in $O(\log^3 n)$ bit operations, i.e., very rapidly. If a is
chosen at random, then most composite numbers n will fail the Fermat test. However, this
test has a serious drawback in that there exist composite numbers n, called Carmichael
numbers, for which $a^{n-1} \equiv 1 \pmod n$ for all a prime to n. Carmichael numbers tend to
be rather rare (the smallest is 561 = 3 · 11 · 17); however, it was recently proved that there
are infinitely many Carmichael numbers (see [7]).
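
The Fermat test and its failure on 561 can be seen in a short sketch. The function names are illustrative; the exhaustive loop over all bases is only feasible for tiny n and is there to exhibit the Carmichael property, not as a practical test.

```python
from math import gcd

def fermat_passes(n, a):
    """True if a^(n-1) is congruent to 1 modulo n (n passes the Fermat test to base a)."""
    return pow(a, n - 1, n) == 1

def passes_all_coprime_bases(n):
    """True if n passes the Fermat test for every base a prime to n (brute force)."""
    return all(fermat_passes(n, a) for a in range(2, n) if gcd(a, n) == 1)

# 91 = 7 * 13 is exposed by the base 2, although it happens to pass for the base 3.
base2_on_91 = fermat_passes(91, 2)
base3_on_91 = fermat_passes(91, 3)

# 561 = 3 * 11 * 17 passes for every base prime to it.
carmichael_561 = passes_all_coprime_bases(561)
```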

There is a refined Fermat test due to Miller [75] that does not suffer from this defect
and can be performed as rapidly as the $a^{n-1}$ test. Namely, the Miller test has the feature
that no composite odd number n passes it for more than 25% of the choices of residue
class a prime to n. (For most composite n, the chance of passing the Miller test is actually
much less than 25%.) The Miller test proceeds as follows. Write $n - 1 = 2^s t$, where t is
odd. For any given a prime to n, first compute $a^t$ modulo n. If this is equal to ±1, then n
passes the test. If not, then repeatedly take the square modulo n, obtaining the residue of
$a^{2^j t}$, j = 1, 2, ..., s − 1. If any of these residues is −1 modulo n, then n passes the test.
If not, then n fails the test. (Notice that a prime number n would pass the test, because:
(i) $a^{2^s t} = a^{n-1} \equiv 1 \pmod n$, if n is prime; and (ii) −1 is the only square root of 1 modulo
n other than 1 itself, if n is prime.)

Applying the Miller test with 10 different random values of a gives, for all practical
purposes, an infallible test that n is prime. In fact, n is almost certain to be prime if it
passes the Miller test for just the four values a = 2, 3, 5, 7; it turns out that there is only
one odd composite number less than $25 \cdot 10^9$ that passes the Miller test for all four of these
values of a (see [88]).
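
The Miller test as described can be written out as follows. This is a sketch with illustrative function names, assuming n is odd and larger than every base used. The composite 3215031751 = 151 · 751 · 28351 appearing in the usage note is not named in the text; it is the value commonly quoted in the literature for the single exception below 25 · 10^9 and is included on that basis.

```python
def miller_passes(n, a):
    """True if the odd number n passes the Miller test to the base a (a prime to n)."""
    s, t = 0, n - 1
    while t % 2 == 0:            # write n - 1 = 2^s * t with t odd
        s += 1
        t //= 2
    x = pow(a, t, n)             # a^t modulo n
    if x == 1 or x == n - 1:     # equal to +1 or -1: n passes
        return True
    for _ in range(s - 1):       # squares a^(2^j t), j = 1, ..., s - 1
        x = x * x % n
        if x == n - 1:
            return True
    return False

def probable_prime(n, bases=(2, 3, 5, 7)):
    """True if n passes the Miller test for every base in `bases`."""
    return all(miller_passes(n, a) for a in bases)
```

With this sketch, `probable_prime(561)` is false (the Carmichael number 561 already fails for a = 2), while `probable_prime(3215031751)` is true even though that number is composite; adding the base 11 exposes it.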


6.2 Proofs of Primality 


If the generalized Riemann Hypothesis is true, then any composite number n must fail the
Miller test for some a in the interval $1 < a < 2 \log^2 n$ (see [75, 10]). Thus, applying the
Miller test for all of these values of a (a procedure which takes only polynomial time,
namely $O(\log^5 n)$ bit operations) will (under the generalized Riemann Hypothesis) give
a proof either that n is prime or that it is composite.

In the early 1980s, Adleman, Pomerance, and Rumely [4] developed an elegant and
efficient deterministic algorithm that gives a proof of primality using Gauss sums. Cohen
and Lenstra [24] then improved upon this algorithm, by using Jacobi sums in place of Gauss
sums. The idea was to perform a series of Fermat-like tests with Jacobi sums, working in the
ring of integers of the cyclotomic field generated by the n-th roots of unity. The Jacobi sum
test requires $(\log n)^{O(\log\log\log n)}$ bit operations. Strictly speaking, the algorithm is not quite
polynomial time; and the existence of an unconditional polynomial time deterministic proof 
of primality is still an open question. However, in practice the Jacobi sum algorithm is very 
fast in the 200-digit range. This shows that whether or not a given algorithm is polynomial 
time, while of great theoretical importance, is not always what determines its practical 
utility. For a detailed description of the algorithm and a discussion of implementation, see 
[23, 24]. 


6.3 Certificates of Primality 


Given a probable prime n, we now want not only to prove primality, but also to do it in such 
a way that the validity of our proof can be checked in far less time than it took to produce 
the proof. At present, the most efficient way to do this (first proposed in [32]) uses elliptic 
curves $Y^2 = X^3 + aX + b$, where the coefficients a and b and the coordinates (x, y) of
points on the curve are integers considered modulo n. (We shall need some basic facts 
about elliptic curves over finite fields; for an introduction to this subject, see [53], [100], 
or [43].) 

The point of departure is an analogue of the following special case of a theorem of 
Pocklington [81]: If $q > \sqrt{n}$ is a prime divisor of n − 1 and if there exists an integer a
satisfying $a^{n-1} \equiv 1 \pmod n$ and g.c.d.$(a^{(n-1)/q} - 1, n) = 1$, then n is prime. This result
is easy to prove, by observing that if p is any prime divisor of n, then the assumptions
imply that the element $a^{(n-1)/q}$ has exact order q modulo p; but then q | p − 1, and so
$p > q > \sqrt{n}$. Since a composite n would have a prime divisor $p \le \sqrt{n}$, n must be prime.
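
The hypotheses of this special case are easy to check mechanically. The sketch below assumes that q is already known to be prime (the function does not verify this) and uses an illustrative function name; it returns True only when the theorem applies, so a False result says nothing about n.

```python
from math import gcd

def pocklington_proves(n, q, a):
    """True if (q, a) satisfy the hypotheses of the special case of Pocklington's theorem.

    q is assumed to be prime; that assumption is NOT checked here.
    """
    return (q * q > n                                       # q > sqrt(n)
            and (n - 1) % q == 0                            # q divides n - 1
            and pow(a, n - 1, n) == 1                       # a^(n-1) = 1 (mod n)
            and gcd(pow(a, (n - 1) // q, n) - 1, n) == 1)   # g.c.d.(a^((n-1)/q) - 1, n) = 1
```

For example, n = 23 is proved prime by q = 11 and a = 2, since 2^22 ≡ 1 (mod 23) and g.c.d.(2^2 − 1, 23) = 1.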

Pocklington’s theorem can be of practical use in proving primality of certain special 
primes n, but it cannot be applied to prove primality of an arbitrary prime n (because n − 1
must be factored, and it must be divisible by a prime $> \sqrt{n}$). On the other hand, the
elliptic curve analogue of Pocklington’s theorem does in fact lead to a primality proof that
is generally applicable. This analogue is as follows: Let E be an elliptic curve defined over
the ring Z/nZ, and let m be an integer having a prime divisor $q > (n^{1/4} + 1)^2$. If there
exists a point P on E such that $m \cdot P = O$ and $(m/q) \cdot P$ is defined and different from O
(where O denotes the identity point of E), then n is prime. We make some remarks about
how this statement is converted to a primality algorithm. 


1. The integer chosen as m has the property that, if n is prime, then m is the order of
the group E. Both the group operation on E and the computation of m are performed
under the supposition that n is prime; if the corresponding operations break down
because of compositeness of n, then at the same time a nontrivial factor of n will be
revealed. In practice, of course, this will never happen, since one goes to the trouble
of proving primality only for a probable prime n (e.g., n has already passed the Miller
test for a = 2, 3, 5, 7).

2. If n is prime, then Hasse’s theorem says that the order of the group E is between
$n + 1 - 2\sqrt{n}$ and $n + 1 + 2\sqrt{n}$. If n is not prime and if p | n is a prime $< \sqrt{n}$,
then the order of E mod p (the group of $\mathbf{F}_p$-points of E) is $< p + 1 + 2\sqrt{p} <$
$(n^{1/4} + 1)^2$.

3. Multiplying a point on E mod n by a large integer such as m or m/q can be done very 
efficiently, using the “repeated doubling” method that is analogous to the “repeated 
squaring” method for exponentiation modulo n. 

4. The reason why the elliptic curve analogue of Pocklington (and not the original
Pocklington theorem) can be used to test any n is that the number m that is analogous
to n − 1 varies as we vary the elliptic curve E. Instead of being stuck with
n − 1, we thus have flexibility in arriving at a number m that is divisible by
$q > (n^{1/4} + 1)^2$.

5. Because of the difficulty of factoring m, in practice one looks for an m that is a small
multiple of a prime q, i.e., q will actually be much greater than $(n^{1/4} + 1)^2$.

6. The most time-consuming part of the algorithm is finding a curve E with suitable 
m. Once we find E and m (and also a point P with the desired property), then it 
takes much less time to check that with this choice the conditions of the theorem are 
satisfied, and so n is prime. 

7. The point P is selected at random. Once E is found with the desired property (namely,
its order m is divisible by a large prime q), almost any point P on E has order dividing 
m but not m/q. 

8. In practice, q, like n itself, is known only to be a probable prime. After completing 
the algorithm for n, one has a proof of primality contingent upon q being prime. One 
then applies the procedure recursively to prove primality of q. 

9. To find a suitable E, two methods have been proposed, respectively by Goldwasser and
Kilian and by A.O.L. Atkin: (1) random selection of the coefficients of the defining
equation, and then computation of m = #E using Schoof’s algorithm [96] (or one
of the recent improvements; see the discussion below); (2) selection of a complex
multiplication field first, and then m, followed by the construction of E. See [78, 9]
for more details. 


Whichever method is used, the final certificate of primality of n will be a sequence of
4-tuples $(E_i, m_i, q_i, P_i)$, the first of which satisfies the conditions of the elliptic Pocklington
theorem for the probable prime n, the second of which satisfies the conditions for the
probable prime $q_1$ dividing $m_1$, the third of which satisfies the conditions for the probable
prime $q_2$ dividing $m_2$, and so on. Given such a certificate, one can verify its correctness,
and hence the primality of n, in polynomial time.

It should be mentioned that it has not been proved that one can always find an elliptic 
curve modulo n for which m is a small multiple of a prime. The difficulty is that little is 
known about primes in short intervals. One can avoid this problem by using genus-two 
hyperelliptic curves rather than elliptic curves (see [3]), but at the cost of having a much 
more complicated algorithm. The hyperelliptic curve primality test is of theoretical but not 
practical interest. 

We conclude this section by remarking that in cryptographic applications one does not 
usually need a polynomial time certificate of primality. Thus, one normally requires only
that n pass the Miller test for several values of a, or (if one wants to be completely certain
of primality) that it pass the Jacobi sum test.


7 Factoring 


Since the invention in 1978 of the RSA cryptosystem, the security of which depends upon the 
assumed difficulty of factoring large integers, more work has been devoted to the factoring 
problem than to any other problem in computational number theory. Despite some major 
advances, RSA seems still to be secure if the modulus n is the product of two primes of 
about 100 digits. 

All of the subexponential algorithms to factor general integers (i.e., integers not of any
special form) are probabilistic rather than deterministic. Most of them (but not all) are
based upon the following simple observation. Suppose that n is an odd number having r
distinct prime factors. Then every square in $(\mathbf{Z}/n\mathbf{Z})^*$ has $2^r$ square roots modulo n. Thus,
if we can produce a congruence of the form $x^2 \equiv y^2 \pmod n$, where x and y are obtained
independently of one another, then there is a $1 - 2^{-(r-1)}$ probability that $x \not\equiv \pm y \pmod n$.
It is easy to see that as soon as that happens, we need only compute g.c.d.(x + y, n) to
obtain a nontrivial factor of n. Even in the “worst” case r = 2 (as in RSA), we have a 50%
chance of factoring n any time we obtain a congruence $x^2 \equiv y^2 \pmod n$. If this congruence
does not lead to a factorization of n, then we simply repeat the procedure with different
random choices, and we again have a 50% chance of success. The chance is only about 1
in 1000 that we will fail to factor n within 10 repetitions of the algorithm.
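
A small numerical instance makes the observation concrete. The sketch below uses the toy modulus n = 91 = 7 · 13 and an illustrative function name; it returns a factor when x ≢ ±y (mod n) and None in the unlucky case.

```python
from math import gcd

def split_from_squares(x, y, n):
    """Given x^2 = y^2 (mod n), return g.c.d.(x + y, n) if it is a nontrivial factor."""
    if (x * x - y * y) % n != 0:
        raise ValueError("x^2 and y^2 are not congruent modulo n")
    d = gcd(x + y, n)
    return d if 1 < d < n else None

lucky = split_from_squares(10, 3, 91)     # 10^2 = 100 = 9 = 3^2 (mod 91), and 10 is not +-3
unlucky = split_from_squares(10, 81, 91)  # 81 = -10 (mod 91), so the g.c.d. is n itself
failure_after_10 = 0.5 ** 10              # chance that ten independent tries all fail (r = 2)
```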


7.1 Quadratic Sieve 


One of the basic factoring algorithms of this type is called the quadratic sieve. Consider
the quadratic polynomial $f(X) = (X + [\sqrt{n}])^2 - n \in \mathbf{Z}[X]$. This polynomial has two
obvious properties: (1) $f(x) \equiv (x + [\sqrt{n}])^2 \pmod n$ for any $x \in \mathbf{Z}$; and (2) f(x) is
significantly smaller than n in absolute value (of order roughly $\sqrt{n}$) if x is small in absolute
value. Suppose that we evaluate $f(x_i)$ at a sequence of fairly small integers $x_1, x_2, \ldots$,
and find that for some subsequence $x_{i_1}, x_{i_2}, \ldots$ the product of the $f(x_{i_l})$ happens to be a
perfect square. Then all we have to do is set x equal to the least positive residue modulo n
of $\prod_l (x_{i_l} + [\sqrt{n}])$, and set y equal to the least positive residue modulo n of $\sqrt{\prod_l f(x_{i_l})}$;
then we have $x^2 \equiv y^2 \pmod n$. Note that x and y were formed in quite different ways,
so there is no greater-than-random likelihood that $x \equiv \pm y \pmod n$. That is, we obtain a
congruence of the desired form.

The hard part is to select the $x_{i_l}$ so that $\prod_l f(x_{i_l})$ is a perfect square. To do this we
select a set of primes, called a factor base, consisting of $p_1 = 2$ and the first r − 1
odd primes $p_2, p_3, \ldots, p_r$ for which $\left(\frac{n}{p_j}\right) = 1$ (i.e., n is a quadratic residue modulo $p_j$).
The choice of r will be made later in a certain optimal way. If $x_i$ is small, then there is
a good chance that $f(x_i)$ is not divisible by a prime greater than $p_r$, in which case it is a
product of the primes in the factor base. (Note that if $p \mid f(x_i)$, then n is a square modulo
p; so $f(x_i)$ cannot be divisible by any prime p for which $\left(\frac{n}{p}\right) = -1$. This is why we took
only primes for which n is a quadratic residue.) Thus, suppose we have a sequence $x_i$ of s
different values of x such that

$$f(x_i) = \prod_{j=1}^{r} p_j^{a_{ij}}, \qquad i = 1, \ldots, s.$$

If s is large enough, then it should be possible to find a subset $\{x_{i_l}\}_l \subset \{x_i\}_i$ such that

$$\prod_l f(x_{i_l}) = \prod_j p_j^{\sum_l a_{i_l j}}$$

is a perfect square, i.e., $\sum_l a_{i_l j}$ is an even number for each j = 1, ..., r. This method of
factoring seems to have first appeared in print in [56].

A systematic way to find the required subset $\{x_{i_l}\}$ of $x_1, \ldots, x_s$ is due to Brillhart and
Morrison [17]. One simply regards the last condition (that the sum of the exponents
$a_{i_l j}$ be an even number) as an equation over the field $\mathbf{F}_2$, and uses linear algebra. More
precisely, let $u_1, \ldots, u_s$ be unknowns taking the value 0 (if the corresponding $x_i$ is not in the
subsequence $\{x_{i_l}\}$) or 1 (if the corresponding $x_i$ is in the subsequence). Then the condition
we want is that

$$\sum_{i=1}^{s} a_{ij} u_i \equiv 0 \pmod 2 \qquad \text{for } j = 1, \ldots, r.$$

If s is somewhat greater than r, then this system has a nontrivial solution; in fact, several
solutions. Each solution gives a different subset $\{x_{i_l}\}$, and hence a different congruence of
the form $x^2 \equiv y^2 \pmod n$. As explained above, at least one of these congruences is almost
certain to give us a nontrivial factor of n.
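
The factor base, the exponent vectors, and the linear algebra over $\mathbf{F}_2$ fit together as in the following sketch. It is a toy under stated assumptions: the function names are illustrative, smooth values of f(x) are found by plain trial division rather than by sieving, only x ≥ 1 is used so that f(x) > 0, and the elimination is a simple bitmask Gaussian elimination that would not scale to real factor bases.

```python
from math import gcd, isqrt

def factor_base(n, limit):
    """2 together with the odd primes p < limit for which n is a quadratic residue mod p."""
    primes = [p for p in range(2, limit) if all(p % d for d in range(2, isqrt(p) + 1))]
    return [p for p in primes if p == 2 or pow(n, (p - 1) // 2, p) == 1]

def exponents(v, base):
    """Exponent vector of v over the factor base, or None if v is not a product of its primes."""
    exps = []
    for p in base:
        e = 0
        while v % p == 0:
            v //= p
            e += 1
        exps.append(e)
    return exps if v == 1 else None

def factor_by_congruences(n, base, span):
    root = isqrt(n)
    rels = []                                   # pairs (x + [sqrt n], exponent vector of f(x))
    for x in range(1, span):
        exps = exponents((x + root) ** 2 - n, base)
        if exps is not None:
            rels.append((x + root, exps))
    pivots = {}                                 # lowest set bit -> (parity row, subset mask)
    for i, (_, exps) in enumerate(rels):
        row = sum((e & 1) << j for j, e in enumerate(exps))   # exponents modulo 2
        combo = 1 << i                          # which relations have been combined so far
        while row:
            low = row & -row
            if low not in pivots:
                pivots[low] = (row, combo)
                break
            prow, pcombo = pivots[low]
            row ^= prow
            combo ^= pcombo
        else:                                   # row reduced to zero: a solution u of the system
            x_val, total = 1, [0] * len(base)
            for k, (xr, ex) in enumerate(rels):
                if combo >> k & 1:
                    x_val = x_val * xr % n
                    total = [t + e for t, e in zip(total, ex)]
            y_val = 1
            for p, t in zip(base, total):       # every total exponent is even
                y_val = y_val * pow(p, t // 2, n) % n
            d = gcd(x_val + y_val, n)
            if 1 < d < n:
                return d
    return None
```

For n = 1649 and primes below 30, the factor base is [2, 5, 7, 23, 29]; the relations f(1) = 32 and f(3) = 200 multiply to 6400 = 80^2, giving x = 41 · 43 ≡ 114 and y = 80, and g.c.d.(114 + 80, 1649) = 97.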

It is important to choose r wisely. If r is too small, then it will take us a very long time to 
find even a single x; for which f (x;) can be written as a product of primes in the factor base. 
On the other hand, if r is too large, then our r × s system of linear congruences modulo 2
will take us too long to solve (and we might also have difficulty storing such a large system).
One can use a heuristic argument to show that r should be of order of magnitude roughly
equal to $\exp(\frac{1}{2}\sqrt{\log n \log\log n})$ (where log denotes natural log). In the case of the largest
numbers factored by this method (for which $\log n \approx 200$), r was chosen to be several tens
of thousands. 

In 1981, Pomerance developed an important technique, called sieving, for speeding
up the process of finding the large sequence $\{x_i\}_{i=1,\ldots,s}$ that is needed in the above algorithm.
We briefly describe one version of this technique. The idea is to find many of the $x_i$ at once.
Consider a long interval of T consecutive values of x. In each of T positions corresponding
to the x’s, we store an approximation to log f(x). Next, for each odd prime p in our
factor base, we find the two solutions mod p of the congruence $f(x) \equiv 0 \pmod p$. (This
amounts to finding the two square roots of n modulo p.) We now go to the interval of T
consecutive values of x, and run through the two arithmetic progressions with difference
p corresponding to the two solutions of $f(x) \equiv 0 \pmod p$. For each such value of x, we
subtract an approximation to log p from the stored value. (The stored value is equal to the
approximation to log f(x) minus the sum of the approximations to log $p_j$ for each of the
earlier $p_j$ for which x was in one of the two corresponding arithmetic progressions, i.e., for
which $p_j \mid f(x)$.) After completing this procedure for all of the $p_j$ in the factor base, we
look once more through the interval of T consecutive values of x. The locations where the
stored number is close to zero will be the ones for which f(x) is a product of the $p_j$.
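
The sieving idea can be imitated on a small scale. The sketch below is a variation rather than the version described above: it also sieves by powers of each prime (including 2), so that the stored value ends at essentially zero exactly when f(x) is a product of factor-base primes, and it finds the roots of f modulo each prime power by brute force instead of computing square roots of n. The function name and the threshold 0.5 are illustrative choices.

```python
from math import isqrt, log

def sieve_smooth(n, base, T):
    """Return the x in 1..T for which f(x) = (x + [sqrt n])^2 - n factors over `base`."""
    root = isqrt(n)

    def f(x):
        return (x + root) ** 2 - n

    stored = [log(f(x)) for x in range(1, T + 1)]     # position x - 1 holds log f(x)
    top = f(T)                                        # largest value in the interval
    for p in base:
        pk = p
        while pk <= top:                              # sieve by p, p^2, p^3, ...
            roots = [r for r in range(pk) if f(r) % pk == 0]
            for r in roots:
                start = r if r >= 1 else pk           # first x >= 1 in the progression
                for x in range(start, T + 1, pk):
                    stored[x - 1] -= log(p)           # one log p per power of p dividing f(x)
            pk *= p
    # What remains is the log of the part of f(x) outside the factor base.
    return [x for x in range(1, T + 1) if stored[x - 1] < 0.5]
```

The inner loops touch only the positions in the relevant arithmetic progressions, which is the source of the speed-up over testing each f(x) by trial division.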

It turns out that the expected running time for the quadratic sieve factoring method is of 
the form $\exp((1 + o(1))\sqrt{\log n \log\log n})$. For a discussion of running time estimates for
factoring algorithms — some of which have been rigorously proved, and others of which 
are based only on heuristic arguments — see [84]. For a more precise discussion of sieving 
in the above setting and also in the so-called multiple polynomial version, see [87]. 


7.2 Elliptic Curve Factorization 


A turning point in the application to cryptography of supposedly “pure” algebraic number
theory was the discovery by H. W. Lenstra, Jr. of an ingenious method of using elliptic 
curves to factor integers. We now outline Lenstra’s algorithm. 

Let n be the large odd integer that we want to factor, and let p be an as yet unknown
prime factor of n. Let $E_{a,b}$ denote the elliptic curve $Y^2 = X^3 + aX + b$ considered modulo
n. Strictly speaking, $E_{a,b}$ modulo n is not a group; however, it is a group when considered
modulo any prime p | n. As in the case of elliptic curve primality testing, we use the usual
formulas for the group operation on an elliptic curve, working always modulo n, i.e., we
“pretend” that n is prime. In the steps described below, if we ever find ourselves unable
to apply those formulas (because of a denominator that is not prime to n and hence not
invertible modulo n), we are almost certain to obtain a nontrivial factor of n, and we are
done. (Namely, we take the g.c.d. of n with the troublesome denominator.)

Let r be a positive integer chosen in a certain optimal way, let $p_1, \ldots, p_r$ be the first r
primes, and let m = m(r) denote $\prod_{j=1}^{r} p_j^{\alpha_j}$, where $p_j^{\alpha_j}$ is the highest power of $p_j$ that is
less than $(n^{1/4} + 1)^2$.

We start by choosing a random pair consisting of the coefficient a of our elliptic curve
and a point $P = (x_0, y_0)$ in the xy-plane modulo n. We want P to lie on the curve
$E_{a,b}$: $Y^2 = X^3 + aX + b$, so the coefficient b is determined by P and a: $b = y_0^2 - x_0^3 - a x_0$.
(We should check that $4a^3 + 27b^2$ is prime to n, so that our elliptic curve is not degenerate.)
We next compute (or rather, try to compute) the point mP on $E_{a,b}$. If we are able to compute
the multiple mP (i.e., if all of the denominators we encounter in the addition and doubling
formulas are prime to n), then we have not succeeded in factoring n, and we must make
another random choice of P and a.

Eventually we are lucky, in that our curve $E_{a,b}$ has the following property: for some
prime p | n, the order of $E_{a,b}$ modulo p (i.e., the order of the group of $\mathbf{F}_p$-points) is
$p_r$-smooth (this means that $\#E_{a,b}(\mathbf{F}_p)$ has no prime factors greater than $p_r$). If we suppose that
$p < \sqrt{n}$, this means that $\#E_{a,b}(\mathbf{F}_p)$ divides m, because by Hasse’s theorem this order is
$< (n^{1/4} + 1)^2$. In that case the point mP considered modulo p is the identity, i.e., the point
at infinity. This situation will show up in our computations in the form of a denominator
divisible by p. At the same time, it is unlikely that the same denominator is divisible by
n. Thus, in computing mP we will encounter a denominator whose g.c.d. with n gives a
nontrivial factor (most likely the prime factor p).
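
The procedure can be sketched as follows. This is an illustration under stated assumptions, not a tuned implementation: the names are hypothetical, the bound $(n^{1/4} + 1)^2$ is replaced by the slightly larger integer ([n^{1/4}] + 2)^2, mP is computed by multiplying successively by each prime power, the curves are drawn from a seeded generator so that runs are repeatable, and Python 3.8+ is assumed for modular inverses via `pow`.

```python
import random
from math import gcd, isqrt

class FactorFound(Exception):
    """Raised when a denominator is not invertible modulo n; carries its g.c.d. with n."""
    def __init__(self, d):
        self.d = d

def inv_mod(x, n):
    g = gcd(x % n, n)
    if g != 1:
        raise FactorFound(g)          # the "troublesome denominator"
    return pow(x, -1, n)

def ec_add(P, Q, a, n):
    """Usual addition formulas, 'pretending' n is prime; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % n == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * inv_mod(2 * y1, n) % n
    else:
        lam = (y2 - y1) * inv_mod(x2 - x1, n) % n
    x3 = (lam * lam - x1 - x2) % n
    return (x3, (lam * (x1 - x3) - y1) % n)

def ec_mul(k, P, a, n):
    """Repeated doubling."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, a, n)
        P = ec_add(P, P, a, n)
        k >>= 1
    return R

def lenstra_factor(n, primes, max_curves=500, seed=0):
    rng = random.Random(seed)
    bound = (isqrt(isqrt(n)) + 2) ** 2           # integer stand-in for (n^(1/4) + 1)^2
    for _ in range(max_curves):
        a, x0, y0 = (rng.randrange(n) for _ in range(3))
        b = (y0 * y0 - x0 ** 3 - a * x0) % n     # forces P = (x0, y0) onto the curve
        g = gcd(4 * a ** 3 + 27 * b * b, n)
        if g == n:
            continue                             # degenerate curve, try another
        if g > 1:
            return g
        P = (x0, y0)
        try:
            for p in primes:
                pk = p
                while pk * p < bound:            # highest power of p below the bound
                    pk *= p
                P = ec_mul(pk, P, a, n)
                if P is None:
                    break
        except FactorFound as e:
            if e.d < n:
                return e.d
    return None
```

Applied to n = 455839 = 599 · 761 with the first ten primes, this search returns one of the two prime factors after trying some number of random curves.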

It remains for us to discuss what the probability is that the randomly chosen $E_{a,b}$ has the
desired property. According to Hasse’s theorem, the order of $E_{a,b}(\mathbf{F}_p)$ is in the interval
$[p + 1 - 2\sqrt{p},\ p + 1 + 2\sqrt{p}]$. Moreover, it follows from work of Deuring and Schoof that as a
and b vary, most of these orders fall in the interval $(p - \sqrt{p},\ p + \sqrt{p})$, where they are roughly
uniformly distributed (see [62] for details). Thus, the chance that $\#E_{a,b}(\mathbf{F}_p)$ is $p_r$-smooth
is about the same as the chance that a random integer in the interval $(p - \sqrt{p},\ p + \sqrt{p})$ is
$p_r$-smooth.

Note that r must be chosen in an optimal way. If it is too small, then we might never find
an elliptic curve whose order modulo p is $p_r$-smooth. On the other hand, if r is too big,
then it will take us too long to compute mP, since m = m(r) grows rather rapidly with r.

At this point Lenstra needs an unproved conjecture to the effect that the standard results
in [20] on the distribution of $p_r$-smooth integers in the interval (0, x) as $x \to \infty$ also apply
to the smaller interval $(x - \sqrt{x},\ x + \sqrt{x})$ as $x \to \infty$. Assuming this plausible conjecture,
one can find an optimal strategy (i.e., an optimal choice of r) so as to find the desired elliptic
curve and factor n as rapidly as possible, and one can rigorously analyze the running time.
As in the case of the quadratic sieve factoring algorithm, the expected running time turns
out to have the form $\exp((1 + o(1))\sqrt{\log n \log\log n})$.

In the case of the hardest numbers to factor, in practice the elliptic curve factorization 
algorithm is not quite as efficient as its main competitor, the quadratic sieve method. For 
example, in factoring a number n that is the product of two primes of size roughly $\sqrt{n}$, as
in RSA, the best versions of the quadratic sieve are somewhat faster [87]. However, the
elliptic curve method is the only subexponential general purpose factoring algorithm whose
expected running time actually depends on the size of the smallest prime factor p of n. If
$p \ll \sqrt{n}$, then the fastest way to factor n is probably by elliptic curves.

Until recently, all of the contenders for the best general purpose factoring algorithm had 
running time of the form $\exp(O(\sqrt{\log n \log\log n}))$. Some people even thought that this
function of n might be a natural lower bound on the running time of such an algorithm. 
However, in the early 1990s a new method — called the number field sieve — was developed 
that has a heuristic running time that is much better (asymptotically), namely: 


$$\exp(O((\log n)^{1/3} (\log\log n)^{2/3})).$$


The number field sieve is similar to the earlier algorithms that attempt to combine
congruences so as to obtain a relation of the form $x^2 \equiv y^2 \pmod n$. However, one uses a
“factor base” in the ring of integers of a suitably chosen algebraic number field. The details
are quite complicated, so we shall not describe the algorithm here. See [61] for details.


8 Discrete Logarithm 


Recall that the discrete logarithm problem in a finite field $\mathbf{F}_q$ is the problem, given $x \in \mathbf{F}_q^*$,
of solving the relation $g^y = x$ for the integer y. Two types of algorithms are known for
solving this problem: (1) algorithms that are inefficient unless q − 1 happens to have no
large prime factors; and (2) index calculus algorithms.

An example of the first type is D. Shanks’ “baby-step/giant-step” method (see pp. 9,
575-576 of [46]) combined with the Silver-Pohlig-Hellman algorithm [41]. We give
a sketch of this algorithm. First, a routine argument based on the Chinese Remainder
Theorem reduces the problem to that of solving an equation of the form $g^y = x$ when g
and x are p-th roots of 1 in $\mathbf{F}_q$, where p is a prime divisor of q − 1. Now compute the
two sets $S_1 = \{x g^i\}_{0 \le i < \sqrt{p}}$ and $S_2 = \{g^{j([\sqrt{p}]+1)}\}_{0 \le j < \sqrt{p}}$ and compare them. When you
have a match, i.e., $x g^i = g^{j([\sqrt{p}]+1)}$, you are done: just take $y = j([\sqrt{p}] + 1) - i$. This
method is practical if q − 1 is a product of small primes, but not if q − 1 is divisible by a
prime p of 40 digits or more [79].
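
The comparison of the two sets can be sketched as follows. The function name is illustrative, and the sketch covers only the baby-step/giant-step part: it works in the subgroup generated by g, whose order is passed in as `order`, and omits the Chinese Remainder reduction. It lets j run up to [√order] + 1 and reduces the answer modulo the order, a small convenience not in the text.

```python
from math import isqrt

def baby_step_giant_step(g, x, q, order):
    """Return y with g^y = x in F_q^*, where g has the given order, or None if none exists."""
    m = isqrt(order) + 1                 # plays the role of [sqrt(p)] + 1
    baby = {}
    v = x % q
    for i in range(m):                   # S1 = { x g^i }
        baby.setdefault(v, i)
        v = v * g % q
    step = pow(g, m, q)
    v = 1
    for j in range(m + 1):               # S2 = { g^(j m) }
        if v in baby:                    # match: x g^i = g^(j m)
            return (j * m - baby[v]) % order
        v = v * step % q
    return None
```

Both sets have about √order elements, so time and storage grow like the square root of the subgroup order; this is why a 40-digit prime divisor of q − 1 puts the method out of reach.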

For a description of another important algorithm of the first type, due to Pollard, see [82] 
or [68]. 

When q − 1 has a large prime divisor, the most efficient algorithms are all of index calculus
type. There are two variants of these algorithms: those that apply when $q = p^f$ is a power
of a small prime, and those that apply when q = p is prime itself or a very small power of
p. In both cases there are algorithms having running time $\exp(O(\sqrt{\log q \log\log q}))$. See
[2] for an algorithm that is subexponential in $q = p^f$ as both p and f increase.

Index calculus algorithms are probabilistic rather than deterministic, and in spirit they 
resemble the “combination of congruences” procedure used to produce a relationship of the 
form $x^2 \equiv y^2 \pmod n$ in factoring algorithms such as the quadratic sieve (see above).

We shall outline the index calculus method in the case when q is prime. For simplicity
we shall suppose that g is a generator of $\mathbf{F}_q^*$. The first part of the algorithm is called
precomputation. This means that it needs to be performed only once for a given field $\mathbf{F}_q$.
The second part of the algorithm must be performed for each individual logarithm, i.e., it
depends upon the particular x whose discrete logarithm we want.

In the precomputation, let $p_1, \ldots, p_r$ be the first r primes, where r is chosen in a certain
optimal way. Then for random positive integers l < q − 1 consider the least positive residue
of $g^l$. If this integer is divisible by a prime $> p_r$, we move on to another l. But any time that
the least positive residue of $g^{l_i}$ factors into a product of the $p_1, \ldots, p_r$ we obtain a relation
of the form

$$\prod_{j=1}^{r} p_j^{a_{ij}} = g^{l_i} \quad \text{in } \mathbf{F}_q^*.$$

For $a \in \mathbf{F}_q^*$, let $\mathrm{ind}_g a$ denote the discrete logarithm (also called the “index”) of a to the
base g. Then the last formula implies the following relation of exponents:

$$\sum_{j=1}^{r} a_{ij}\, \mathrm{ind}_g p_j \equiv l_i \pmod{q - 1}.$$

This is a system of linear congruences in the unknowns $\mathrm{ind}_g p_j$, j = 1, ..., r. Once we
have enough independent congruences of this form, we can solve for the discrete logs of
the $p_j$. It should be noted that, although solving a system of congruences is quite routine
in principle, there are subtle questions that arise when one tries to decrease the time spent 
on this part of the algorithm (see [86]). (For a discussion of index calculus implementation 
in the case of the fields F5:, see [37].) 

Once the precomputation is complete, it is relatively easy to find $y = \mathrm{ind}_g x$. Namely, we
choose random $l'$ and compute $x g^{l'}$ until we find a least positive residue of $x g^{l'}$ that has no
prime divisor $> p_r$. Writing $x g^{l'}$ as a product of the $p_j$ and using the values of $\mathrm{ind}_g p_j$
from the precomputation, we find the discrete log of $x g^{l'}$. Then we simply subtract $l'$ to get
the desired $y = \mathrm{ind}_g x$.
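
The second stage can be sketched on a toy field. The example below assumes q = 1019 with generator g = 2 and factor base {2, 3, 5, 7, 11}; all names are illustrative. The precomputed indices are obtained here by brute force, purely as a placeholder for the linear-algebra precomputation, so only the individual-logarithm step is an instance of the method described.

```python
import random

def smooth_exponents(v, base):
    """Exponent vector of v over the factor base, or None if v has a larger prime factor."""
    exps = []
    for p in base:
        e = 0
        while v % p == 0:
            v //= p
            e += 1
        exps.append(e)
    return exps if v == 1 else None

def brute_force_indices(g, q, base):
    """Placeholder for the precomputation: table lookup of ind_g p_j (feasible only for tiny q)."""
    table, v = {}, 1
    for k in range(q - 1):
        table[v] = k
        v = v * g % q
    return [table[p] for p in base]

def individual_log(x, g, q, base, indices, rng):
    """Find ind_g x by trying random l' until x * g^l' factors over the base."""
    while True:
        l = rng.randrange(q - 1)
        exps = smooth_exponents(x * pow(g, l, q) % q, base)
        if exps is not None:
            total = sum(e * ind for e, ind in zip(exps, indices))  # ind_g of x * g^l'
            return (total - l) % (q - 1)                           # subtract l'

q, g = 1019, 2                         # toy prime field and generator (assumptions)
base = [2, 3, 5, 7, 11]
indices = brute_force_indices(g, q, base)
y = individual_log(123, g, q, base, indices, random.Random(1))
```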

It remains to explain how r should be chosen. If r is too small, it will take too long
to find $g^l$ for which the least positive residue has no prime factor $> p_r$. On the other
hand, if r is too large, the linear system in the r unknowns $\mathrm{ind}_g p_j$ will be too hard to
solve. An optimal value of r can be chosen using the asymptotic behavior of the function
$\psi(X, s)$ = #{s-smooth positive integers ≤ X}. (Recall that a positive integer is said to be
s-smooth if it has no prime factors > s.) The result of this analysis is that, very roughly
speaking, $r \approx \exp(\sqrt{\log q \log\log q})$.

Using such an optimal r, one can show that the running time of the algorithm is of the form
$\exp(O(\sqrt{\log q \log\log q}))$. Most of this running time is devoted to (1) finding $p_r$-smooth
values of $g^{l_i}$, and (2) solving the linear system of congruences for $\mathrm{ind}_g p_j$.

Recently, D. Gordon has modified the number field sieve factoring algorithm so as to
obtain a discrete logarithm algorithm for $\mathbf{F}_q$. The result is also an index calculus algorithm,
but it is much more complicated, because one works in the ring of integers of an algebraic
number field. Although the discrete log number field sieve has not yet been extensively
tested in practice, it can be shown to have a much better asymptotic running time than earlier
algorithms, namely $\exp(O(\log^{1/3} q \log\log^{2/3} q))$.

We conclude our discussion of algorithms for factoring and discrete log by observing that 
there is thought to be a close relationship between the complexity of factoring an integer n 
and the complexity of solving the discrete log problem in $\mathbf{F}_q$, where q and n have the same
order of magnitude. See [80] and [101] for discussions of this question. Until recently, all
of the best algorithms for both problems had time estimates of the form L(n, 1/2), where for
$0 < \gamma < 1$ one defines $L(n, \gamma) = \exp(O(\log^{\gamma} n \log\log^{1-\gamma} n))$. Then with the invention
of the number field sieve for factoring [61], the time estimate for factoring was brought 
down to L(n, 1/3). Soon after, the number field sieve was also applied to the discrete log 
problem [35, 36], bringing the time estimate for discrete log down to L(q, 1/3) as well. 
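
To get a feeling for the difference between L(n, 1/2) and L(n, 1/3), one can evaluate the expression inside the O with the implied constant set to 1. That constant is an assumption made only for illustration (the true constants differ between algorithms), so the numbers indicate growth rates, not actual running times.

```python
from math import exp, log

def L(n, gamma, c=1.0):
    """exp(c * (log n)^gamma * (log log n)^(1 - gamma)); c = 1 is an illustrative choice."""
    return exp(c * log(n) ** gamma * log(log(n)) ** (1 - gamma))

n = 10 ** 200                  # a 200-digit modulus
half = L(n, 1 / 2)             # shape of the quadratic sieve and elliptic curve estimates
third = L(n, 1 / 3)            # shape of the number field sieve estimate
```

Both values are far below n itself, and the gap between them widens as n grows.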


9 Elliptic Curve Cryptosystems 


Soon after Lenstra showed how to use elliptic curves for factoring, Koblitz [47] and Miller 
[76] independently proposed using the abelian group of an elliptic curve as the basis for 
cryptosystems such as the Diffie-Hellman key exchange. We now describe how this works 
and discuss the corresponding discrete logarithm problem. 

We first define the discrete log problem in an arbitrary abelian group $G$. If the group law
in $G$ is written additively, and if $P \in G$ denotes a fixed base element, then the discrete log
problem in $G$ to the base $P$ is the following: Given $X \in G$, determine an integer $y$ such
that $X = yP$ in $G$ (or else determine that $X$ is not in the subgroup generated by $P$).

The Diffie-Hellman key exchange was first proposed for the group $\mathbb{F}_q^*$, but it can just as
well be stated in any finite abelian group $G$. As before, Alice and Bob want to agree upon a
large integer to serve as a key for some private-key cryptosystem. Alice and Bob first select
a group $G$, a map from elements of $G$ to the integers, and a base element $P \in G$. All of
this is discussed publicly.

Next, Alice and Bob choose random secret integers $k_A$ and $k_B$, respectively. Alice
sends Bob the group element $k_A P$ (but not her secret integer $k_A$), and Bob sends Alice the
element $k_B P$ (but not the integer $k_B$). The common key is then $k_A k_B P \in G$ (or rather, the
integer associated to $k_A k_B P$ under the agreed upon correspondence). Alice determines this
key by multiplying her secret integer $k_A$ times the element $k_B P$ received from Bob; Bob
determines the key by multiplying $k_B$ times the element $k_A P$ that he received from Alice.
An eavesdropper is in the difficult position of having to determine the element $k_A k_B P$
knowing only $P$, $k_A P$ and $k_B P$, but not $k_A$ or $k_B$.

Now take $G = E$ to be an elliptic curve defined over a finite field $\mathbb{F}_q$. Let $P \in E$ be a
fixed base point. The multiples $k_A P$, $k_B P$, and $k_A k_B P$ are computed using the addition
law on the curve and the “repeated doubling” method. The secret key (known by Alice
and Bob but no one else) is the point $k_A k_B P$ or, more precisely, a certain positive
integer corresponding to this point, for example, its $x$-coordinate in the case when $\mathbb{F}_q$ is a
prime field.
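
The exchange can be sketched in a few lines of Python over a toy prime field. Everything concrete here is an assumption chosen for illustration: the curve $Y^2 = X^3 + 2X + 2$ over $\mathbb{F}_{17}$, the base point $(5, 1)$ (which has order 19), the secret integers, and the function names `ec_add` and `ec_mul`. Real systems use fields of cryptographic size and constant-time arithmetic.

```python
from typing import Optional, Tuple

Point = Optional[Tuple[int, int]]  # None represents the point at infinity


def ec_add(P: Point, Q: Point, a: int, p: int) -> Point:
    """Add two points on Y^2 = X^3 + aX + b over F_p (b is not needed here)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = O; also covers doubling a point with y = 0
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p  # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)


def ec_mul(k: int, P: Point, a: int, p: int) -> Point:
    """Compute kP by repeated doubling (binary double-and-add)."""
    result: Point = None
    addend = P
    while k > 0:
        if k & 1:
            result = ec_add(result, addend, a, p)
        addend = ec_add(addend, addend, a, p)
        k >>= 1
    return result


# Toy parameters (assumed for illustration only).
a, p, P = 2, 17, (5, 1)
k_A, k_B = 3, 7                      # secret integers of Alice and Bob
from_alice = ec_mul(k_A, P, a, p)    # Alice sends k_A P
from_bob = ec_mul(k_B, P, a, p)      # Bob sends k_B P
key_alice = ec_mul(k_A, from_bob, a, p)
key_bob = ec_mul(k_B, from_alice, a, p)
```

Both parties arrive at the same point, whose $x$-coordinate can then serve as the shared integer. The modular inverse via `pow(x, -1, p)` needs Python 3.8 or later.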

The main potential advantage of an elliptic curve cryptosystem is that the discrete
logarithm problem seems to be harder to solve in the group $E$ than in the group $\mathbb{F}_q^*$. In the
latter group, the generally applicable discrete log algorithms that do not require exponential
time (all of which can be categorized as “index calculus” algorithms) rely on the fact
that one can find a fairly large set of “small” group elements in terms of which any other
element can be expressed with small exponents. For example, if $q = p$ is a large prime,
then this set (called a “factor basis”) might consist of all prime numbers less than a certain
bound. On the other hand, if $q = p^f$, where $p$ is small and $f$ is large, and if $\mathbb{F}_q$ is regarded
as $\mathbb{F}_p[X]/F(X)$ for a fixed irreducible polynomial $F(X)$ of degree $f$, then the factor basis
might consist of all irreducible polynomials of degree less than a certain bound, regarded
modulo $F(X)$. In the case of the group $E$, however, it does not seem to be possible to find
such a factor basis. As argued in [76], the most natural notion of “small” (the reduction
modulo $p$ of a point in characteristic zero having small height) does not work.

Until the early 1990s, the only discrete log algorithms known for an elliptic curve were the
ones that work in any group, irrespective of any particular structure. These are exponential
time algorithms, provided that the order of the group is divisible by a large prime factor. But
then Menezes, Okamoto, and Vanstone [71] found a new approach to the discrete log problem
on an elliptic curve $E$ defined over $\mathbb{F}_q$. Namely, they used the Weil pairing (see §III.8
of [100]) to imbed the group $E$ into the multiplicative group of some extension field $\mathbb{F}_{q^k}$.
This imbedding reduces the discrete log problem on $E$ to the discrete log problem in $\mathbb{F}_{q^k}^*$.

However, in order for the Weil pairing reduction to help, it is essential for the extension
degree $k$ to be small. Essentially the only elliptic curves for which $k$ is small are the so-called
“supersingular” elliptic curves, the most familiar examples of which are curves of
the form $Y^2 = X^3 + aX$ when the characteristic $p$ of $\mathbb{F}_q$ is $\equiv -1 \pmod 4$, and curves of the
form $Y^2 = X^3 + b$ when $p \equiv -1 \pmod 3$. The vast majority of elliptic curves, however,
are nonsupersingular. For them, it was shown by Balasubramanian and Koblitz [12] that
the reduction almost never leads to a subexponential algorithm.
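
As a rough illustration of why the extension degree matters: for a base point of prime order $n$ not dividing $q$, the pairing values are $n$-th roots of unity, so the relevant $k$ is (under the usual assumptions) the smallest positive integer with $q^k \equiv 1 \pmod n$. The helper below, whose name `embedding_degree` and cutoff `k_max` are hypothetical, simply searches for it.

```python
from typing import Optional


def embedding_degree(q: int, n: int, k_max: int = 1000) -> Optional[int]:
    """Smallest k >= 1 with q^k ≡ 1 (mod n), or None if no k <= k_max works."""
    power = 1
    for k in range(1, k_max + 1):
        power = power * q % n
        if power == 1:
            return k
    return None
```

For a curve with $q + 1$ points one has $q \equiv -1 \pmod{q+1}$, so $k = 2$; for a generic order the search typically returns a $k$ of about the same size as $n$, which is what makes the reduction useless in the nonsupersingular case.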

Thus, the first advantage of elliptic curve cryptosystems is that no subexponential algorithm
is known that breaks the system, provided that we avoid supersingular curves and
also curves whose order has no large prime factor. A second selling point is that, unlike
in the case of the groups $\mathbb{F}_q^*$, there is a tremendous variety of elliptic curves from which
to choose. To take full advantage of this, we must have an efficient way to determine the
order of a randomly chosen elliptic curve $E$ defined over $\mathbb{F}_q$.

The first polynomial time algorithm to compute $\#E$ was discovered by René Schoof.
Schoof’s algorithm is even deterministic. It is based on the idea of finding the value of $\#E$
modulo $\ell$ for all primes $\ell$ less than a certain bound. This is done by examining the action of
the “Frobenius” (the $p$-th power map) on points of order $\ell$.

In the original paper [96] the bound for running time was O (log® g), which is polynomial 
but quite unpleasant. At first it looked like the algorithm was not practical. However, since 
then many people have worked on speeding up Schoof’s algorithm (see [8, 58, 65, 74]). As 
a result it has become feasible to compute the order of an arbitrary elliptic curve over $\mathbb{F}_q$ if
$q$ is a prime power of a few hundred decimal digits (this is several times more digits than
one needs in practice). If this order is found to have a large prime factor, then the discrete 
log problem on it is intractable at our present level of knowledge and technology, and so 
the curve can be used for a secure cryptosystem. 

Several methods of choosing elliptic curves have been suggested. One possibility is to
stick with a simple equation, such as $Y^2 = X^3 + aX$ or $Y^2 = X^3 + b$. If the curve is
supersingular, then it is trivial to compute $\#E$. For example, if $q \equiv -1 \pmod 4$ for the first
equation or $q \equiv -1 \pmod 3$ for the second equation, then $\#E = q + 1$. But in this case
the discrete logarithm problem reduces to discrete logs in $\mathbb{F}_{q^2}^*$ [71], so cryptosystems using
$E$ have no strong advantage over cryptosystems using the multiplicative group of a finite
field. But even if the curve is nonsupersingular, because of the special form of the equation
there are algorithms to find $\#E$ that are simpler and faster than Schoof’s algorithm and its
variants. For example, computing the order of the curve $Y^2 = X^3 + aX$ over $\mathbb{F}_p$, where
$p \equiv 1 \pmod 4$, amounts to expressing $p$ as a sum of two squares (see, for example, §II.2
of [53]), and this can easily be done [15].
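
The claim $\#E = q + 1$ for these special equations is easy to check by brute force over small prime fields. The sketch below is only a sanity check, not one of the fast algorithms referred to above; the function name `count_points` is illustrative, and the method (Euler's criterion for each $x$) takes time proportional to $p$.

```python
def count_points(a: int, b: int, p: int) -> int:
    """#E(F_p) for Y^2 = X^3 + aX + b over a prime field F_p, p odd.

    Counts affine solutions via Euler's criterion and adds the point at infinity.
    """
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x + a * x + b) % p
        if rhs == 0:
            total += 1          # single point (x, 0)
        elif pow(rhs, (p - 1) // 2, p) == 1:
            total += 2          # rhs is a nonzero square: two values of y
    return total
```

For example, `count_points(a, 0, p)` returns `p + 1` for every nonzero `a` when $p \equiv -1 \pmod 4$, and `count_points(0, b, p)` returns `p + 1` for every nonzero `b` when $p \equiv -1 \pmod 3$.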

A second possibility is to start with a fixed elliptic curve $E_1$ over a small finite field $\mathbb{F}_q$,
and then consider the sequence $E_r$ obtained by regarding the same curve over the extension
field $\mathbb{F}_{q^r}$. The sequence $N_r = \#E_r$ is determined in a simple way from $N_1 = \#E_1$, using
the congruence zeta-function (see, e.g., §II.1 of [53]).
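
One way to see how $N_r$ follows from $N_1$: writing $t = q + 1 - N_1$, the numerator of the zeta-function is $1 - tT + qT^2$ with reciprocal roots $\alpha, \bar\alpha$, and $N_r = q^r + 1 - (\alpha^r + \bar\alpha^r)$. The power sums $s_r = \alpha^r + \bar\alpha^r$ satisfy $s_0 = 2$, $s_1 = t$, $s_r = t\,s_{r-1} - q\,s_{r-2}$, so integer arithmetic suffices. The function name `extension_orders` below is illustrative.

```python
from typing import List


def extension_orders(q: int, N1: int, r_max: int) -> List[int]:
    """Return [N_1, ..., N_{r_max}], where N_r = #E(F_{q^r}), given N_1 = #E(F_q)."""
    t = q + 1 - N1               # trace of Frobenius
    s_prev, s_cur = 2, t         # s_0 and s_1
    q_power = q
    orders = [N1]
    for _ in range(2, r_max + 1):
        s_prev, s_cur = s_cur, t * s_cur - q * s_prev   # recurrence for s_r
        q_power *= q
        orders.append(q_power + 1 - s_cur)
    return orders
```

For the curve $Y^2 = X^3 + X$ over $\mathbb{F}_3$, which has $N_1 = 4$, this gives $N_2 = 16$, $N_3 = 28$, $N_4 = 64$; each is divisible by $N_1$, as it must be.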

Another possibility is to generate elliptic curves in some random manner over a fixed
finite field $\mathbb{F}_q$. One could take $q = p$ to be an arbitrary prime whose size is roughly the
same as that of the group $E$ one needs. (By Hasse’s theorem, $|\#E - p - 1|$ is bounded by
$2\sqrt{p}$.) Alternatively, one could take $q = 2^r$ to be a fixed large power of 2. The choice
of an elliptic curve in characteristic 2 has some advantages in practice, because arithmetic
can be performed somewhat more efficiently in such fields. It has been suggested that a
random elliptic curve over $\mathbb{F}_{2^r}$ would be particularly suitable for a cryptosystem that could
be implemented on a “smart card,” where the space limitation is severe [73].

In each case, one must search for an elliptic curve whose order has a large prime factor; 
otherwise, discrete logs can be found quickly by the baby-step/giant-step and Silver—Pohlig— 
Hellman algorithms. 

Finally, it should be mentioned that, besides elliptic curves and the multiplicative groups 
of finite fields, other groups have also been proposed for Diffie-Hellman type cryptosystems: 
the jacobians of hyperelliptic curves [49], the class groups of imaginary quadratic number 
fields [18], and others. 


10 Classical Number Theory Problems and Elliptic Curve
Analogues that are of Cryptographic Interest


10.1 Diffie-Hellman in $\mathbb{F}_q^*$


When choosing a finite field $\mathbb{F}_q$ for use in a Diffie-Hellman key exchange, it is crucial that
the number of elements in the multiplicative group

$$q - 1 = \#\mathbb{F}_q^*$$

be divisible by a very large prime. This is because, if $q - 1$ is “smooth” (not divisible
by a large prime), then the cryptosystem is susceptible to the baby-step/giant-step and
Silver-Pohlig-Hellman algorithms.
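
To indicate what the baby-step/giant-step attack looks like, here is a minimal sketch in the multiplicative group of a prime field. It solves $g^x \equiv h \pmod p$ in roughly $2\sqrt{N}$ group operations, where $N$ is the order of $g$; combined with the Silver-Pohlig-Hellman reduction (not shown), the cost is governed by the largest prime factor of $N$. The function name `bsgs` and the toy parameters in the usage note are assumptions for illustration.

```python
from math import isqrt
from typing import Optional


def bsgs(g: int, h: int, p: int, order: int) -> Optional[int]:
    """Find x in [0, order) with g^x ≡ h (mod p), or None if there is none."""
    m = isqrt(order) + 1
    # Baby steps: store g^j for 0 <= j < m (keeping the smallest j per value).
    table = {}
    e = 1
    for j in range(m):
        table.setdefault(e, j)
        e = e * g % p
    # Giant steps: compare h * g^(-i*m) with the table.
    factor = pow(g, -m, p)
    gamma = h % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % p
    return None
```

For example, with $p = 101$ and the primitive root $g = 2$, `bsgs(2, pow(2, 77, 101), 101, 100)` recovers the exponent 77 using about twenty multiplications instead of up to a hundred.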

The two most common ways to choose $q$ are (1) to set $q$ equal to a randomly generated
prime $p$ of sufficient size, and (2) to set $q$ equal to $2^r$. In each situation we might ask what
the chances are that $\#\mathbb{F}_q^*$ is a prime number or a very small factor times a prime number. In
the first case, the most we can hope for is that $p - 1$ be equal to twice a prime $p_1$:

$$p = 2p_1 + 1.$$

A prime $p_1$ such that $2p_1 + 1$ is prime is called a “Sophie Germain prime,” since in the 1820s
Germain proved the so-called “first case” of Fermat’s Last Theorem for prime exponents
$p_1$ for which $2p_1 + 1$ is prime. This was the first major result on FLT for a large class
of exponents, and it attracted a lot of interest to the set of such primes $p_1$. Thus, we are
led to the question: Are there infinitely many Sophie Germain primes; and if so, then for
large $x$ what is the probability that both $p_1$ and $2p_1 + 1$ will be primes for $p_1 \approx x$? This
is a difficult unsolved problem of number theory that was of interest to mathematicians for
many decades for a reason unrelated to cryptography: people wanted to know whether the
first case of Fermat’s Last Theorem had been proved for infinitely many primes.
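
Small Sophie Germain primes are easy to enumerate, which at least makes the question tangible. The sketch below uses plain trial division; the function names are illustrative, and nothing here bears on the open problem of infinitude.

```python
from typing import List


def is_prime(n: int) -> bool:
    """Primality by trial division (adequate for small n only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def sophie_germain_primes(limit: int) -> List[int]:
    """Primes p1 <= limit such that 2*p1 + 1 is also prime."""
    return [p1 for p1 in range(2, limit + 1)
            if is_prime(p1) and is_prime(2 * p1 + 1)]
```

Each such $p_1$ yields a prime $p = 2p_1 + 1$ for which $\mathbb{F}_p^*$ has a subgroup of prime order $p_1$.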

If we choose $q = 2^r$, then it is possible for $\#\mathbb{F}_{2^r}^*$ itself to be prime. This happens if $r$ is
prime and $2^r - 1$ is a Mersenne prime. Thus, we are led to the famous Mersenne prime
question: Are there infinitely many Mersenne primes; and, if so, what can be said about
their frequency of occurrence for large $r$?
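
Whether $2^r - 1$ is prime for a given prime exponent $r$ can be decided with the classical Lucas-Lehmer test, sketched below. The test itself is standard; the function name `is_mersenne_prime` is illustrative, and the caller is assumed to pass a prime $r$.

```python
def is_mersenne_prime(r: int) -> bool:
    """Lucas-Lehmer test: is 2^r - 1 prime? Assumes r itself is prime."""
    if r == 2:
        return True                # 2^2 - 1 = 3
    m = (1 << r) - 1
    s = 4
    for _ in range(r - 2):
        s = (s * s - 2) % m        # s_{k+1} = s_k^2 - 2 (mod 2^r - 1)
    return s == 0
```

Among the prime exponents up to 31, the test succeeds for $r = 2, 3, 5, 7, 13, 17, 19, 31$ and fails for $r = 11, 23, 29$.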

If we want to work in finite fields of $q = p^r$ elements for small $p > 2$, then, again
choosing $r$ prime, we can hope for $(q - 1)/(p - 1) = (p^r - 1)/(p - 1)$ to be prime. Again
we have a difficult unsolved problem of classical number theory: For fixed $p$, are there
infinitely many $r$ such that $(p^r - 1)/(p - 1)$ is prime? Almost nothing has been proved
about these questions. In fact, it is not even known that there are infinitely many primes of
the form $(p^r - 1)/(p - 1)$ as $p$ and $r$ both vary over the set of prime numbers.


10.2 Elliptic Curve Cryptosystems 


As mentioned before, there are basically three approaches to choosing an elliptic curve for 
a cryptosystem. In each case one looks for a curve whose order #E has a very large prime 
factor, and in each case the question of the likelihood of encountering such a curve leads to 
some interesting unsolved problems. 


10.2.1 Fix a “Global” Elliptic Curve and Vary the Prime


For example, let $E$ be an elliptic curve $Y^2 = f(X) = X^3 + aX + b$ defined over the
field $\mathbb{Q}$ of rational numbers. If $p$ is any odd prime not dividing the denominators of the
coefficients or the discriminant of $f(X)$, then one can consider the elliptic curve $E$ over
$\mathbb{F}_p$ that is obtained by simply reducing the coefficients modulo $p$. That elliptic curve will
always contain as a subgroup the image of the (finite) torsion subgroup $E_{\text{tors}}$ of the curve
over $\mathbb{Q}$. But in many cases the quotient will have prime order.


Question: For a fixed curve $E$ over $\mathbb{Q}$, what can be said about the probability as $p$ varies
that

$$\frac{\#(E \bmod p)}{\#E_{\text{tors}}}$$

is a prime number? Can one prove (for any fixed $E$) that there are infinitely many $p$ for
which this number is prime?


This question is analogous to the Sophie Germain prime question. Namely, if instead of
$E$ we take the multiplicative semigroup of nonzero integers, which has torsion subgroup
$\{\pm 1\}$, then we get our earlier question: As $p$ varies, what can be said about the probability
that $p_1 = (p - 1)/2 = \#\mathbb{F}_p^* / \#\{\pm 1\}$ is prime?

Just as the question about Sophie Germain primes is of interest when using a Diffie-Hellman
type cryptosystem in the multiplicative group of a prime field $\mathbb{F}_p$, the analogous
elliptic curve question given above is of interest when using an elliptic curve cryptosystem.

It is worth noting that the denominator $\#E_{\text{tors}}$ in the elliptic curve question is often 1, and
in any case it cannot be much larger than in the Sophie Germain prime question. According
to a deep result of Mazur [67], there are at most 16 torsion points on an elliptic curve over $\mathbb{Q}$.

For a discussion of conjectures on the above question for elliptic curves, see [48]. 

Another natural question in this context is the following. Suppose that P is a point of 
infinite order in the group of the elliptic curve E over Q. As p varies, what is the probability 
that the image modulo p of P generates E modulo p? Results on this problem have been 
obtained by Gupta and Murty [39]. This question is the analogue of a classical problem 
of E. Artin, who conjectured formulas for the probability that a fixed integer (like 2 or 10)
generates $\mathbb{F}_p^*$ as $p$ varies.


10.2.2 Fix an Elliptic Curve over a Small Field $\mathbb{F}_q$ and Then Consider
It over $\mathbb{F}_{q^r}$ as $r$ Varies


Since $E(\mathbb{F}_{q^{r'}})$ is a subgroup of $E(\mathbb{F}_{q^r})$ whenever $r' \mid r$, large prime factors of $\#E(\mathbb{F}_{q^r})$ are
more likely to occur when $r$ is prime than when $r$ is composite. In the case of prime $r$, the
best one can hope for is that

$$\frac{\#E(\mathbb{F}_{q^r})}{\#E(\mathbb{F}_q)} = \frac{|\alpha^r - 1|^2}{|\alpha - 1|^2}$$

is prime, where $\alpha$ is a reciprocal root of the numerator of the zeta-function $Z(E/\mathbb{F}_q; T)$.


Question: For fixed $E/\mathbb{F}_q$, what is the probability as $r$ varies that the above number is
prime? Can one ever prove that there are infinitely many $r$ such that it is prime?


Virtually nothing is known about this question. It is analogous to the classical Mersenne
prime problem, as we see by replacing $\alpha$ by 2.


10.2.3 Fix the Field of Definition $\mathbb{F}_q$ and Vary the Coefficients


According to Hasse’s theorem, $\#E$ falls in a rather small interval around $q + 1$, namely,
$[q + 1 - 2\sqrt{q},\; q + 1 + 2\sqrt{q}]$. As $E$ ranges over all elliptic curves defined over $\mathbb{F}_q$, the number
$\#E$ is distributed fairly uniformly in this interval, except that the density drops off near the
endpoints (see [62]). Thus, the probability that $\#E$ is prime (or has a prime factor greater
than some lower bound) is essentially the same as the probability that a random integer in
an interval of the form $[q, q + c\sqrt{q}]$ ($c$ a constant) has this property. But unfortunately,
at present almost nothing can be proved about the occurrence of primes in such “short”
intervals. It is not even known whether there exists a $c$ such that the interval $[q, q + c\sqrt{q}]$
always contains at least one prime as $q \to \infty$.
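
For a very small prime field one can simply enumerate all curves and look at how the orders fall inside the Hasse interval. The sketch below does this by brute force; the function names `count_points` and `curve_orders`, and the restriction to short Weierstrass equations over a prime field with $p > 3$, are assumptions made for the illustration.

```python
from typing import Dict


def count_points(a: int, b: int, p: int) -> int:
    """#E(F_p) for Y^2 = X^3 + aX + b, p an odd prime, including infinity."""
    total = 1
    for x in range(p):
        rhs = (x * x * x + a * x + b) % p
        if rhs == 0:
            total += 1
        elif pow(rhs, (p - 1) // 2, p) == 1:
            total += 2
    return total


def curve_orders(p: int) -> Dict[int, int]:
    """Map each group order N to the number of nonsingular pairs (a, b) with #E = N."""
    histogram: Dict[int, int] = {}
    for a in range(p):
        for b in range(p):
            if (4 * a ** 3 + 27 * b ** 2) % p == 0:
                continue  # singular cubic, not an elliptic curve
            n = count_points(a, b, p)
            histogram[n] = histogram.get(n, 0) + 1
    return histogram
```

Running `curve_orders(13)` tabulates the 156 nonsingular equations over $\mathbb{F}_{13}$; every order lies within $2\sqrt{13} \approx 7.2$ of 14, and one can read off directly which of them are prime.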

It is remarkable that practical questions relating to the security and the efficient imple- 
mentation of cryptosystems often lead to problems in number theory that are intrinsically 
interesting from a theoretical and aesthetic point of view; and many number-theoretic con- 
cepts and results that were once thought to have no conceivable practical application have 
recently played a role in the development and analysis of new cryptosystems. There is 
no a priori reason why we should have expected this to be the case. Indeed, the famous 
statement that Eugene Wigner made in reference to mathematical physics seems to apply 
to the subject of the present article as well: 

...the enormous usefulness of mathematics in the natural sciences is something bordering
on the mysterious and ... there is no rational explanation for it. [102]


Recent Developments in the Mean Square Theory of the
Riemann Zeta and Other Zeta-Functions


Kohji Matsumoto 


The purpose of the present article is to survey some mean value results obtained recently 
in zeta-function theory. We do not mention other important aspects of the theory of zeta- 
functions, such as the distribution of zeros, value-distribution, and applications to number 
theory. Some of them are probably treated in the articles of Professor Apostol and Professor 
Ramachandra in the present volume. 

Even in the mean value theory, we do not discuss many important recent topics. Those 
include: Recent progress in the theory of large values and fractional moments made 
by Heath-Brown [55] and the Indian school (Ramachandra, Balasubramanian, 
Sankaranarayanan and others, see Ramachandra [172]); mean values taken at the zeros or at 
the points near the zeros (Gonek [30] [31], Fujii [23]-[26] and others); the mean square of the 
product of the zeta-function and a Dirichlet polynomial (see Conrey-Ghosh-Gonek [19] and 
the papers quoted there). All of these three topics are closely connected with the distribution 
of zeros of zeta-functions, hence the full account of them would require too many pages. We 
will only discuss the theory of Titchmarsh series very briefly in Section 7. In the fourth power 
moment theory there have been remarkable developments which may be characterized by 
the use of the spectral theory of Maass wave forms. We mention this theory occasionally, 
but only in connection with the mean square problems. For the full details of this theory, 
see Chapters 4 and 5 of Ivic [68], Motohashi’s book [155], and Jutila’s series of papers. 

In the present article we only discuss the mean square theory of zeta-functions. This is
a rather restricted topic, but still it is impossible to mention all the relevant results because
the recent progress in this area has been so extensive. The main tools appearing in this article are the
approximate functional equations and Atkinson’s method, with the emphasis laid on the latter.
The readers will find, however, that these two tools are not unrelated (see Sections 4 and
6). Efforts are made to explain the mutual connections among various methods and results.

In Section 1, we summarize the results on the mean square $I_\sigma(T)$ of the Riemann zeta-function,
obtained by applying various approximate functional equations. In Sections 3
and 5, $I_\sigma(T)$ is studied from the viewpoint of the method of Atkinson. The background of
Atkinson’s method is the divisor problem, which is mentioned in Sections 2 and 6. Then, 
after a brief discussion on some short interval results in Section 7, we proceed to survey 
the results on more general zeta and L-functions. Sections 8, 9, 10 and 11 are devoted, 
respectively, to the mean square theory of Dedekind zeta-functions, L-functions attached 
to cusp forms, Dirichlet L-functions, and Hurwitz zeta and other related zeta-functions. 

Throughout this article, $s = \sigma + it$ is a complex variable, $\zeta(s)$ the Riemann zeta-function,
$\Gamma(s)$ the gamma-function, $\gamma$ the Euler constant, $\varphi(n)$ the Euler function, $d(n)$ the number
of positive divisors of $n$, and $\sigma_a(n) = \sum_{d \mid n} d^a$. When $x$ tends to infinity, $f(x) \sim g(x)$
means $\lim_{x \to \infty} f(x)/g(x) = 1$; $f(x) = O(g(x))$ or $f(x) \ll g(x)$ means $|f(x)| \le Cg(x)$
with a certain $C > 0$; $f(x) = \Omega_+(g(x))$ (resp. $f(x) = \Omega_-(g(x))$) means that $f(x_n) >
Cg(x_n)$ (resp. $f(x_n) < -Cg(x_n)$) holds for infinitely many $x_n$ such that $x_n \to \infty$, with a
certain $C > 0$; and $f(x) = \Omega(g(x))$ means that $|f(x)| = \Omega_+(g(x))$. The letter $\varepsilon$ denotes
an arbitrarily small positive number, and $C, C_1, C_2, \ldots$ denote certain constants, which are not
necessarily the same at each occurrence. The references are by no means complete.


The author expresses his gratitude to Professors Martin N. Huxley, Aleksandar Ivic, Matti
Jutila, Shigeru Kanemitsu, Masanori Katsurada, Isao Kiuchi, Shin-ya Koyama, Antanas 
Laurincikas, K. Ramachandra, Vivek V. Rane, Yoshio Tanigawa and Kai-Man Tsang for 
valuable comments and information. He is also indebted to Miss Yumiko Ichihara for her 
laborious work of typesetting this long article. 


1 The Approximate Functional Equations 


A classical problem in the mean value theory of $\zeta(s)$ is to search for the asymptotic formula
of the mean square

$$I_\sigma(T) = \int_0^T |\zeta(\sigma + it)|^2\,dt,$$

where $T \ge 2$. (If $\sigma = 1$, we replace the interval of integration by $[1, T]$.) In view of the
functional equation $\zeta(s) = \chi(s)\zeta(1 - s)$, where

$$\chi(s) = 2(2\pi)^{s-1} \sin\left(\tfrac{1}{2}\pi s\right)\Gamma(1 - s),$$

we may restrict our consideration to the case $\sigma \ge 1/2$. When $\sigma > 1$, the asymptotic
formula

$$I_\sigma(T) \sim \zeta(2\sigma)\,T \tag{1.1}$$
is an easy consequence of the definition of $\zeta(s)$. It was proved by Landau [127, §228,
p. 816] and Schnee [180] that (1.1) holds for any $\sigma > 1/2$. To prove this fact, the simple
approximate formula

$$\zeta(s) = \sum_{n \le \xi} n^{-s} - \frac{\xi^{1-s}}{1-s} + O(\xi^{-\sigma}) \qquad (|t| \le \pi\xi) \tag{1.2}$$

is enough (see Titchmarsh [190, Theorem 7.2]). The most difficult case $\sigma = 1/2$ was
settled in 1918 by Hardy-Littlewood [47], who proved

$$I_{1/2}(T) \sim T \log T, \tag{1.3}$$
by using the Mellin transform. Five years later, Hardy-Littlewood [49] gave an alternative
proof of (1.3). It is based on the approximate functional equation

$$\zeta(s) = \sum_{n \le \xi} n^{-s} + \chi(s) \sum_{n \le \eta} n^{s-1} + R_1(s; \xi, \eta), \tag{1.4}$$

which is a refinement of (1.2). Here $\xi$, $\eta$ are positive, $2\pi\xi\eta = t$, and

$$R_1(s; \xi, \eta) = O(\xi^{-\sigma} + \eta^{\sigma-1} |t|^{1/2-\sigma}). \tag{1.5}$$

The formula (1.4) first appeared in Hardy-Littlewood [48], with a slightly weaker error
estimate than (1.5), and the main instrument of the proof in [48] is the Poisson summation
formula

$$\sideset{}{'}\sum_{a \le n \le b} f(n) = \int_a^b f(u)\,du + 2\sum_{n=1}^{\infty} \int_a^b f(u) \cos(2\pi n u)\,du \tag{1.6}$$

(for $f \in C^1[a, b]$; the symbol $\sum'$ indicates that $\tfrac{1}{2}f(a)$ and $\tfrac{1}{2}f(b)$ are to be taken instead of
$f(a)$ and $f(b)$, respectively). In [49], Hardy-Littlewood presented an alternative complex-analytic
proof of (1.4) and (1.5).

The formula (1.4) and its relatives are really useful, and dominated the next sixty years
of the mean value theory. Littlewood [134] announced that

$$I_{1/2}(T) = T \log T - (1 + \log 2\pi - 2\gamma)T + E(T) \tag{1.7}$$

with $E(T) = O(T^{3/4+\varepsilon})$ (actually the term $2\gamma$ is missing in [134]). Ingham [63] improved
it to $E(T) = O(T^{1/2} \log T)$. Further improvement was done by Titchmarsh [188], who
proved

$$E(T) = O(T^{5/12} \log^2 T). \tag{1.8}$$

Titchmarsh succeeded because he could use the Riemann-Siegel formula, proved by Siegel
[182], which gives the very precise asymptotic expansion of $R_1(s; \sqrt{t/2\pi}, \sqrt{t/2\pi})$. (See
Titchmarsh [190, Theorem 4.16].)

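Hardy-Littlewood [49] also studied the fourth power moment

$$I_{2,\sigma}(T) = \int_0^T |\zeta(\sigma + it)|^4\,dt,$$

and they showed

$$I_{2,\sigma}(T) \sim \frac{\zeta^4(2\sigma)}{\zeta(4\sigma)}\,T \qquad \left(\tfrac{1}{2} < \sigma < 1\right) \tag{1.9}$$

and

$$I_{2,1/2}(T) = O(T \log^4 T). \tag{1.10}$$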
Ingham [63] improved (1.10) to

$$I_{2,1/2}(T) = (2\pi^2)^{-1}\, T \log^4 T + O(T \log^3 T), \tag{1.11}$$

by using the approximate functional equation of $\zeta^2(s)$ due to Hardy-Littlewood [50], that
is

$$\zeta^2(s) = \sum_{n \le x} d(n) n^{-s} + \chi^2(s) \sum_{n \le y} d(n) n^{s-1} + R_2(s; x, y) \tag{1.12}$$

for positive $x$, $y$ with $4\pi^2 xy = t^2$, where

$$R_2(s; x, y) = O\left(x^{1/2-\sigma} \left(\frac{x + y}{|t|}\right)^{1/4} \log |t|\right). \tag{1.13}$$

In [189], Titchmarsh gave a different proof of (1.12) with

$$R_2(s; x, y) = O(x^{1/2-\sigma} \log t). \tag{1.14}$$

Recall that the original proof of (1.4) by Hardy-Littlewood is based on (1.6). Hardy-Littlewood
deduced

$$\zeta(s) = \sum_{n \le x} n^{-s} - \frac{x^{1-s}}{1-s} + 2\sum_{n=1}^{\infty} \int_x^{\infty} u^{-s} \cos(2\pi n u)\,du + O(x^{-\sigma}) \tag{1.15}$$

from (1.6) in the first stage of their argument. As an analogue of (1.15), Titchmarsh [189]
proved


[Equation (1.16), an expansion of $\zeta^2(s)$ in terms of $\sum_{n \le x} d(n) n^{-s}$ and integrals involving $K_1(u) + \tfrac{\pi}{2} Y_1(u)$, is illegible in the source.]


where K, and Y; are Bessel functions. This is the basis of Titchmarsh’s proof of (1.14). 

A climax of applications of approximate functional equations to the mean value problems
came in the late 1970s, with the works of Balasubramanian, Good and Heath-Brown. A
very careful analysis based on the Riemann-Siegel formula enabled Balasubramanian [4]
to obtain the explicit formula

$$E(T) = 2\sum_{\substack{m, n \le K \\ m \ne n}} \frac{\sin(T \log(n/m))}{(mn)^{1/2} \log(n/m)} + 2\sum_{m, n \le K} \frac{\sin(2\theta_1 - T \log mn)}{(mn)^{1/2} (2\theta_1' - \log mn)} + O(\log^2 T), \tag{1.17}$$
where $\theta_1 = \theta_1(T) = \tfrac{1}{2} T \log(T/2\pi) - \tfrac{1}{2} T - \tfrac{1}{8}\pi$, $\theta_1'$ is the derivative of $\theta_1$, and $K =
[(T/2\pi)^{1/2}]$. Then, applying his own idea of multiple integration process to (1.17),
Balasubramanian obtained the estimate of the form

$$E(T) = O(T^{\alpha+\varepsilon}) \tag{1.18}$$

with a certain $\alpha < 1/3$. (In [4], the value $\alpha = 27/82$ was given.) Good [34] proved an
explicit formula of $E(T)$ similar to (1.17) but with certain smoothing factors, and from this
formula he [35] proved

$$E(T) = \Omega(T^{1/4}). \tag{1.19}$$

Heath-Brown’s work [53] gives an improvement on (1.11). He proved a new type of
approximate functional equation, from which he deduced

$$I_{2,1/2}(T) = T \sum_{j=0}^{4} a_j \log^j T + E_2(T),$$

where $a_j$’s are constants, $a_4 = (2\pi^2)^{-1}$, and

$$E_2(T) = O(T^{7/8+\varepsilon}). \tag{1.20}$$

Heath-Brown’s paper also includes an alternative proof of (1.18).

Inspired by these papers, strong interest in the mean value problems revived. Really
big progress has been made since 1980, which is the main theme of the present article. But
first, in the next section, we discuss a closely related problem concerning the behaviour in
mean of the divisor function.


2 The Dirichlet Divisor Problem 


The title of this section means the problem of evaluating the error term $\Delta(x)$ defined by

$$\sideset{}{'}\sum_{n \le x} d(n) = x \log x + (2\gamma - 1)x + \frac{1}{4} + \Delta(x) \tag{2.1}$$

for $x \ge 2$, where $\sum'$ indicates that the last term is to be halved if $x$ is an integer. As can be
observed by comparing (2.1) with (1.7), there is a strong analogy between $\Delta(x)$ and $E(T)$.
Usually the study of $\Delta(x)$ is easier than that of $E(T)$, hence the results on $\Delta(x)$ are quite
suggestive for guessing the behaviour of $E(T)$. Here we quote several known facts on $\Delta(x)$.
Dirichlet himself proved $\Delta(x) = O(x^{1/2})$ and Voronoi [195] improved it to obtain

$$\Delta(x) = O(x^{1/3} \log x). \tag{2.2}$$
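
For orientation, $\Delta(x)$ can be computed directly from (2.1) for moderate $x$. The sketch below uses the identity $\sum_{n \le x} d(n) = \sum_{k \le x} \lfloor x/k \rfloor$; the function names `divisor_sum` and `delta` are illustrative, and the code assumes a non-integer argument so that no term of the sum has to be halved.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def divisor_sum(x: float) -> int:
    """D(x) = sum of d(n) over n <= x, computed as sum over k <= x of floor(x/k)."""
    m = math.floor(x)
    return sum(m // k for k in range(1, m + 1))


def delta(x: float) -> float:
    """Error term of (2.1) for non-integer x >= 2."""
    return divisor_sum(x) - x * math.log(x) - (2 * EULER_GAMMA - 1) * x - 0.25
```

For example, `divisor_sum(100)` is 482 and `delta(100.5)` is about 2.9, well below $\sqrt{100.5} \approx 10$.
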
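
The explicit formula

$$\begin{aligned}
\Delta(x) &= -\frac{2}{\pi}\, x^{1/2} \sum_{n=1}^{\infty} \frac{d(n)}{n^{1/2}} \left\{ K_1(4\pi\sqrt{nx}) + \frac{\pi}{2} Y_1(4\pi\sqrt{nx}) \right\} \\
&= \frac{x^{1/4}}{\pi\sqrt{2}} \sum_{n=1}^{\infty} d(n) n^{-3/4} \cos\left(4\pi\sqrt{nx} - \frac{\pi}{4}\right) \\
&\quad - \frac{3 x^{-1/4}}{32\sqrt{2}\,\pi^2} \sum_{n=1}^{\infty} d(n) n^{-5/4} \sin\left(4\pi\sqrt{nx} - \frac{\pi}{4}\right) + O(x^{-3/4})
\end{aligned} \tag{2.3}$$

is due to Voronoi. Sometimes the truncated form

$$\Delta(x) = \frac{x^{1/4}}{\pi\sqrt{2}} \sum_{n \le N} d(n) n^{-3/4} \cos\left(4\pi\sqrt{nx} - \frac{\pi}{4}\right) + O(x^{\varepsilon} + x^{1/2+\varepsilon} N^{-1/2}) \tag{2.4}$$

is useful. For instance, the formula

$$\int_2^X \Delta^2(x)\,dx = \frac{\zeta^4(3/2)}{6\pi^2 \zeta(3)}\, X^{3/2} + \delta(X) \tag{2.5}$$

with $\delta(X) = O(X^{5/4+\varepsilon})$, due originally to Cramér [21], can be proved by substituting (2.4)
into the left-hand side and squaring out. Tong [191] obtained the improved estimate
$\delta(X) = O(X \log^5 X)$, and an alternative simple proof was given by Meurman [142]. The
best estimate at present is

$$\delta(X) = O(X \log^4 X) \tag{2.6}$$

due to Preissmann [164].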
As for the real order of $\delta(X)$, the author (1992) conjectured (cf. [115]) that $\delta(X) \sim
CX (\log X)^B$ with a certain $B > 0$. Lau-Tsang [128] proved

$$\int_2^X \delta(x)\,dx = -\frac{1}{8\pi^2}\, X^2 \log^2 X + C_1 X^2 \log X + O(X^2), \tag{2.7}$$

which implies $\delta(X) = \Omega_-(X \log^2 X)$, and conjectured

$$\delta(X) = -\frac{1}{4\pi^2}\, X \log^2 X + C_2 X \log X + O(X). \tag{2.8}$$

Moreover, Tsang [193] proved that (2.8) is valid for almost all $X$ in a certain mean
value sense. A generalization of the result of Lau-Tsang [128] was recently obtained
by Furuya [27].

Another mean value formula for $\Delta(x)$ is

$$\begin{aligned}
\int_2^X \Delta(x)\,dx &= \frac{X^{3/4}}{2\sqrt{2}\,\pi^2} \sum_{n=1}^{\infty} d(n) n^{-5/4} \sin\left(4\pi\sqrt{nX} - \frac{\pi}{4}\right) \\
&\quad + \frac{15 X^{1/4}}{64\sqrt{2}\,\pi^3} \sum_{n=1}^{\infty} d(n) n^{-7/4} \cos\left(4\pi\sqrt{nX} - \frac{\pi}{4}\right) + O(1),
\end{aligned} \tag{2.9}$$

which is due to Voronoi. Recently the mean value of the above quantity was studied in
detail by Furuya-Tanigawa [28]. In several references (2.9) was quoted incorrectly. Note
that sometimes $\Delta(x)$ is defined by (2.1) without the term $1/4$; then the term $\tfrac{1}{4}X$ should be
added on the right-hand side of (2.9).

The formula (2.5) includes the fact $\Delta(x) = \Omega(x^{1/4})$, and furthermore, it is known that

$$\Delta(x) = \Omega_-\{x^{1/4} \exp(C (\log\log x)^{1/4} (\log\log\log x)^{-3/4})\} \tag{2.10}$$

(Corradi-Katai [20]) and

$$\Delta(x) = \Omega_+\{(x \log x)^{1/4} (\log\log x)^{(3+\log 4)/4} \exp(-C\sqrt{\log\log\log x})\} \tag{2.11}$$

(Hafner [45]). In view of these $\Omega$-results, it is quite plausible that

$$\Delta(x) = O(x^{1/4+\varepsilon}). \tag{2.12}$$

This is indeed a classical conjecture, but is believed to be extremely difficult. At present,
the best known upper bound is

$$\Delta(x) = O(x^{23/73} (\log x)^{315/146}) \tag{2.13}$$

due to Huxley [59]. This is just a small improvement on (2.2), but an improvement of this
kind requires quite hard analysis of exponential sums. That is, we should use the theory
of exponent pairs, created by van der Corput, and refined by many authors. For instance,
Kolesnik [120] proved $\Delta(x) = O(x^{35/108+\varepsilon})$ by using his own elaborated version of the
theory of exponent pairs, and later he [121] improved the exponent to $139/429 + \varepsilon$.

Bombieri-Iwaniec [13] [14] invented a new method of treating exponential sums, which
gives essentially new exponent pairs (see Huxley-Watt [62]). Combining this method
with the expression

$$\Delta(x) = -2\sum_{n \le \sqrt{x}} \psi\left(\frac{x}{n}\right) + O(1), \tag{2.14}$$

where $\psi(x) = x - [x] - \tfrac{1}{2}$, Iwaniec-Mozzochi [80] obtained the estimate $\Delta(x) =
O(x^{7/22+\varepsilon})$. Huxley succeeded in proving (2.13) by a further refinement of the method of
Bombieri-Iwaniec. For the details of the theory of exponent pairs, the readers are referred
to Graham-Kolesnik [39] or Huxley [61].
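
The expression (2.14) comes from Dirichlet's hyperbola method, and it is easy to check numerically that the two sides differ by a bounded amount. The sketch below compares $\Delta(x)$, computed from its definition (2.1), with $-2\sum_{n \le \sqrt{x}} \psi(x/n)$. The function names are illustrative, and non-integer $x$ is assumed so that neither the halving convention in (2.1) nor the value of $\psi$ at integers plays a role.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def psi(u: float) -> float:
    """psi(u) = u - [u] - 1/2."""
    return u - math.floor(u) - 0.5


def delta(x: float) -> float:
    """Delta(x) from its definition (2.1), for non-integer x >= 2."""
    m = math.floor(x)
    divisor_sum = sum(m // k for k in range(1, m + 1))
    return divisor_sum - x * math.log(x) - (2 * EULER_GAMMA - 1) * x - 0.25


def delta_via_psi(x: float) -> float:
    """Main term of (2.14): -2 * sum of psi(x/n) over n <= sqrt(x)."""
    root = math.isqrt(math.floor(x))  # equals floor(sqrt(x))
    return -2.0 * sum(psi(x / n) for n in range(1, root + 1))
```

For $x = 100.5$ the two values are about 2.91 and 3.28, so the $O(1)$ term is about $-0.37$ there.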


3 The Atkinson Formula and the Recent Results on E(T) 


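As we mentioned in the previous section, there is an analogy between $\Delta(x)$ and $E(T)$.
Therefore it is natural to search for a formula analogous to Voronoi’s (2.3) or (2.4). This
was carried out in 1949 by Atkinson [3]. To state his result, we prepare some notation.
Let $X \asymp T$ (i.e. $T \ll X \ll T$), $\operatorname{arsinh} x = \log(x + \sqrt{1 + x^2})$, and define

$$\begin{aligned}
e(T, n) &= \left(1 + \frac{\pi n}{2T}\right)^{-1/4} \left\{ \left(\frac{2T}{\pi n}\right)^{1/2} \operatorname{arsinh}\sqrt{\frac{\pi n}{2T}} \right\}^{-1}, \\
f(T, n) &= 2T \operatorname{arsinh}\sqrt{\frac{\pi n}{2T}} + (\pi^2 n^2 + 2\pi n T)^{1/2} - \frac{\pi}{4}, \\
g(T, n) &= T \log\left(\frac{T}{2\pi n}\right) - T + \frac{\pi}{4}, \\
B(T, \xi) &= \frac{T}{2\pi} + \frac{\xi^2}{2} - \xi \left(\frac{T}{2\pi} + \frac{\xi^2}{4}\right)^{1/2}, \\
\Sigma_{1,\sigma}(T, X) &= \sqrt{2} \left(\frac{T}{2\pi}\right)^{3/4-\sigma} \sum_{n \le X} (-1)^n \sigma_{1-2\sigma}(n)\, n^{\sigma-5/4}\, e(T, n) \cos(f(T, n))
\end{aligned}$$

and

$$\Sigma_{2,\sigma}(T, X) = -2 \left(\frac{T}{2\pi}\right)^{1/2-\sigma} \sum_{n \le B(T, \sqrt{X})} \sigma_{1-2\sigma}(n)\, n^{\sigma-1} \left(\log \frac{T}{2\pi n}\right)^{-1} \cos(g(T, n)).$$

Then Atkinson’s explicit formula can be stated as

$$E(T) = \Sigma_{1,1/2}(T, X) + \Sigma_{2,1/2}(T, X) + O(\log^2 T). \tag{3.1}$$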


The starting point of Atkinson’s proof of (3.1) is the product $\zeta(u)\zeta(v)$, where $u$ and $v$
are independent complex variables. At first assume $\operatorname{Re} u > 1$, $\operatorname{Re} v > 1$. Then

$$\zeta(u)\zeta(v) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} m^{-u} n^{-v}.$$

Divide this double sum into three parts according to the conditions $m = n$, $m > n$ and
$m < n$ (Atkinson’s dissection). The part corresponding to $m = n$ is clearly $\zeta(u + v)$.
By using the Poisson summation formula (1.6), Atkinson obtained an integral expression
for the remaining parts, which enables the analytic continuation. Then he transformed this
expression by applying Voronoi’s formula (2.3), and evaluated the resulting integrals by his
own saddle-point lemma, which gives an asymptotic formula for integrals of the type

$$\int_a^b g(x) \exp(2\pi i (f(x) + kx))\,dx \tag{3.2}$$

with real $k$ and certain functions $f(x)$ and $g(x)$. The details of the proof are rather long
and complicated.

When Atkinson published (3.1), no one noticed its usefulness. After about thirty years
of neglect, Heath-Brown was the first to pay attention to Atkinson’s paper. In [51],
Heath-Brown proved

$$\int_2^T E^2(t)\,dt = \frac{2\zeta^4(3/2)}{3(2\pi)^{1/2} \zeta(3)}\, T^{3/2} + F(T) \tag{3.3}$$

with

$$F(T) = O(T^{5/4} \log^2 T), \tag{3.4}$$

as an application of (3.1). This is the analogue of (2.5), and implies (1.19) of Good. Another
paper of Heath-Brown [52] deduced the estimate

$$\int_0^T \left|\zeta\left(\tfrac{1}{2} + it\right)\right|^{12} dt = O(T^2 \log^{17} T) \tag{3.5}$$

from a certain estimate of the mean square of $|\zeta(1/2 + it)|$ in short intervals. Atkinson’s
formula (3.1) was used to prove the latter estimate. (See also Chapter 7 of Ivic [66] for a
different proof.)

These works of Heath-Brown showed the fruitfulness of Atkinson’s formula (3.1), but
it was Jutila [81 I] who noticed the real value lying in Atkinson’s method. He sketched in
[81 I] how easily (1.18) can be obtained from (3.1). Following Jutila’s idea, and combining it
with Kolesnik’s technique [120], Ivic described a proof of

$$E(T) = O(T^{35/108+\varepsilon}) \tag{3.6}$$

in Section 15.5 of [66].
The main theme of Jutila’s aforementioned paper [81I] is the analogy between $E(T)$ and
a modification of $\Delta(x)$, that is

$$\Delta^*(x) = -\Delta(x) + 2\Delta(2x) - \frac{1}{2}\Delta(4x).$$
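For experiments, the combination $\Delta^*(x) = -\Delta(x) + 2\Delta(2x) - \frac12\Delta(4x)$ can be computed directly. The sketch below assumes the usual normalisation $\Delta(x) = \sum_{n\le x} d(n) - x(\log x + 2\gamma - 1) - 1/4$ (the form suggested by the reference to the constant $1/4$ in (2.1)) and should only be used when $x$, $2x$ and $4x$ are not integers, so that no half-weighted boundary term arises. The function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def divisor_summatory(x):
    """D(x) = sum_{n <= x} d(n), by Dirichlet's hyperbola method."""
    if x < 1:
        return 0
    n = int(x)
    r = math.isqrt(n)
    return 2 * sum(n // k for k in range(1, r + 1)) - r * r

def delta(x):
    """Error term of the divisor problem (assumed normalisation, non-integer x)."""
    return divisor_summatory(x) - x * (math.log(x) + 2 * EULER_GAMMA - 1) - 0.25

def delta_star(x):
    """Jutila's modification -Delta(x) + 2 Delta(2x) - Delta(4x)/2."""
    return -delta(x) + 2 * delta(2 * x) - 0.5 * delta(4 * x)
```

The hyperbola method keeps the cost at $O(\sqrt{x})$ operations per evaluation, which is enough for tabulating $\Delta^*$ over moderately long ranges.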
A consequence of his analysis is the hypothetical bound

$$E(T) = O(T^{5/16+\varepsilon}) \tag{3.7}$$

under the assumption of the conjecture (2.12). The exponent 5/16 in (3.7) was later improved
to 3/10 by Jutila [81II]. In [82], Jutila proved

$$\int_2^T \left(E(t) - 2\pi\Delta^*\left(\frac{t}{2\pi}\right)\right)^2 dt = O(T^{4/3}\log^3 T). \tag{3.8}$$


The transformation method for Dirichlet polynomials was created by Jutila [83]. The
basic tool of this method is Voronoi’s summation formula

$$\sideset{}{'}\sum_{a\le n\le b} d(n)f(n) = \int_a^b f(u)(\log u + 2\gamma)\,du + \sum_{n=1}^{\infty} d(n)\int_a^b f(u)\left(4K_0(4\pi\sqrt{nu}) - 2\pi Y_0(4\pi\sqrt{nu})\right)du, \tag{3.9}$$

valid for $f \in C^2[a,b]$, where $K_0$ and $Y_0$ are Bessel functions. Jutila’s idea is to transform
the Dirichlet polynomial

$$S(M_1,M_2;t) = \sum_{M_1\le m\le M_2} d(m)\,m^{-1/2-it} \tag{3.10}$$

by applying (3.9), and then use a lemma of Atkinson’s type to evaluate the resulting
expressions. One of his results is an explicit formula for $|\zeta(1/2+it)|^2$, whose shape
is similar to Atkinson’s formula. Several new ideas are included in his argument. One of
them is the device of multiplying the original polynomial (3.10) by trivial factors $e^{2\pi imr}$,
where $r$ is an integer. Another novelty is the multiple-averaged version of Atkinson’s
saddle-point lemma, which gives an asymptotic formula for integrals of the form

$$U^{-N}\int_0^U du_1\cdots\int_0^U du_N\int_{a+u_1+\cdots+u_N}^{b-u_1-\cdots-u_N} g(x)\exp(2\pi i(f(x)+kx))\,dx \tag{3.11}$$

instead of (3.2). This point was fully developed in Jutila [87]. We will encounter the
transformation method again in Sections 7 and 9.


By using the above averaged saddle-point lemma of Jutila, Meurman [142] improved
(3.4) to

$$F(T) = O(T\log^5 T). \tag{3.12}$$

In fact, Meurman proved an averaged version of Atkinson’s formula, which can be stated as

$$E(T) = \Sigma_1^*(T) + \Sigma_2^*(T) + O(T^{-1/4}\log T), \tag{3.13}$$

where $\Sigma_j^*(T)$ is a certain weighted sum similar to $\Sigma_{j,1/2}(T,X)$ ($j = 1, 2$). The deduction
of (3.12) from (3.13) is basically analogous to the argument of Heath-Brown [51].

The estimate (3.12) was also proved independently by Motohashi [149IV] [150]. From
his asymptotic formula for $R_2(s; t/2\pi)$ (see the next section), Motohashi deduced another
version of Atkinson’s formula, from which he obtained (3.12). Motohashi’s argument
includes an alternative proof of the original formula (3.1) with a slightly better error term
$O(\log T)$. Note that yet another proof of (3.1) was obtained by Jutila [92]. His
argument is based on the Laplace transform of $|\zeta(1/2+it)|^2$, and does not appeal to
Atkinson’s dissection device.

Inspired by Preissmann’s proof of (2.6) (which is an application of the inequality (8.12)
below due to Montgomery-Vaughan), Preissmann himself [165] and Ivić [68, (2.100)]
proved, independently of each other,

$$F(T) = O(T\log^4 T), \tag{3.14}$$

which is the best at present. There is a conjecture of the author that $F(T) \sim CT(\log T)^B$
would hold with a certain $B > 0$. Probably $B = 2$.
Higher power moments of $E(t)$ were first studied by Ivić [64], who proved

$$\int_2^T |E(t)|^A\,dt = O(T^{1+A/4+\varepsilon}) \qquad \left(0 \le A \le \frac{35}{4}\right) \tag{3.15}$$

and

$$\int_2^T |E(t)|^A\,dt = O(T^{(38+35A)/108+\varepsilon}) \qquad \left(A \ge \frac{35}{4}\right). \tag{3.16}$$

Heath-Brown [56] proved the existence of the limit

$$\lim_{T\to\infty} T^{-1-A/4}\int_2^T |E(t)|^A\,dt \tag{3.17}$$

for $0 \le A < 28/3$. Tsang [192] obtained

$$\int_2^T E(t)^k\,dt = C(k)T^{1+k/4} + O(T^{1+k/4-\delta}) \tag{3.18}$$

for $k = 3$ or $4$, where $C(k)$ is an explicitly written positive constant, and $\delta > 0$. Recently
Ivić [73] showed, using (3.8), that one can take $\delta = 1/14$ for $k = 3$ and $\delta = 1/23$ for $k = 4$.
In view of the $\Omega$-result (1.19), it is plausible that

$$E(T) = O(T^{1/4+\varepsilon}), \tag{3.19}$$

as an analogue of (2.12). The above results on higher power moments can also be regarded
as facts supporting this conjecture. The best known upper bound is, however, still far
from this conjecture. Heath-Brown and Huxley [57] applied the methods of Bombieri,
Iwaniec and Mozzochi (mentioned in Section 2) and some lemmas proved in [53] to obtain
$E(T) = O(T^{7/22}(\log T)^{111/22})$, which is better than (3.6), and this was further improved to

$$E(T) = O(T^{72/227}(\log T)^{679/227}) \tag{3.20}$$

by Huxley [60].
The $\Omega$-result (1.19) was refined by Hafner-Ivić [46], who showed, analogously to (2.10)
and (2.11), that

$$E(T) = \Omega_-\left\{T^{1/4}\exp\left(C(\log\log T)^{1/4}(\log\log\log T)^{-3/4}\right)\right\} \tag{3.21}$$

and

$$E(T) = \Omega_+\left\{(T\log T)^{1/4}(\log\log T)^{(3+\log 4)/4}\exp\left(-C\sqrt{\log\log\log T}\right)\right\}. \tag{3.22}$$

The local behaviour of sign-changes of $E(T)$ was studied by Ivić [67], Ivić-te Riele [78],
and (independently) Heath-Brown and Tsang [58]. Ivić [67] showed that there exist positive
constants $C_1$ and $C_2$ such that every interval $[T, T + C_1\sqrt{T}]$ (for $T \ge T_0$) contains numbers
$t_1$, $t_2$ for which

$$E(t_1) > C_2t_1^{1/4}, \qquad E(t_2) < -C_2t_2^{1/4}$$

hold. This result is also included in Heath-Brown and Tsang [58].
From Atkinson’s formula Hafner-Ivić [46] deduced, as an analogue of (2.9), that

$$\int_2^T E(t)\,dt = \pi T + \frac{1}{2}\left(\frac{2T}{\pi}\right)^{3/4}\sum_{n=1}^{\infty}(-1)^n d(n)\,n^{-5/4}\sin\left(2\sqrt{2\pi nT} - \frac{\pi}{4}\right) + O(T^{2/3}\log T). \tag{3.23}$$

This implies that the function $E(t)$ has the mean value $\pi$. Hence it is natural to consider
the zeros of the function $E(t) - \pi$, which we denote by $t_n$ ($2 < t_1 < t_2 < \cdots$). Then the
above mentioned result implies that

$$t_{n+1} - t_n \ll t_n^{1/2}. \tag{3.24}$$

Let $\kappa = \inf\{c \ge 0 \,;\, t_{n+1} - t_n \ll t_n^c\}$. Then (3.24) implies that $\kappa \le 1/2$. Ivić-te
Riele [78] studied $\{t_n\}$ both theoretically and numerically, and proposed the conjecture that
$\kappa = 1/4$. This conjecture is very strong because it would lead to (3.19) (see Theorem 1
of [78]). However, this conjecture was disproved by Heath-Brown and Tsang [58]. They
showed that for any $\delta > 0$ and any $T \ge T_0(\delta)$, there are at least $C_1\delta T^{1/2}\log^5 T$ disjoint
subintervals of length $C_2\delta T^{1/2}(\log T)^{-5}$ in $[T, 2T]$ such that

$$|E(t)| > \left(\frac{\zeta^2(3/2)}{2(2\pi)^{1/4}\zeta(3)^{1/2}} - \delta\right)t^{1/4}$$

whenever $t$ lies in any of these subintervals. In particular $E(t)$ does not change sign in any
of these subintervals. Therefore the local behaviour of $E(t)$ is much more mysterious than
was expected by Ivić and te Riele.


4 The Remainder Term in the Approximate Functional
Equation for $\zeta^2(s)$


In the mid-1980s, new light was shed on the remainder term $R_2(s; x, y)$ in (1.12).
Jutila [83] pointed out that (1.12) with (1.14) can be deduced from the Voronoi summation
formula (3.9); this should be compared with the fact, mentioned in Section 1, that (1.4) can
be deduced from the Poisson summation formula (1.6). The details are presented in Ivić
[65] [66] (see Section 4.2 of [66]).

In the special case $x = y = t/2\pi$ (“the symmetric case”), Motohashi [148I] showed that
the estimate (1.14) for $R_2(s; t/2\pi, t/2\pi)$ (which we abbreviate as $R_2(s; t/2\pi)$) follows
from the estimate (1.5) for $R_1(s; \sqrt{t/2\pi}, \sqrt{t/2\pi})$ by the Dirichlet device. However, we
know much more precise information on $R_1(s; \sqrt{t/2\pi}, \sqrt{t/2\pi})$, namely the Riemann-Siegel
formula. What can we obtain if we combine Motohashi’s argument with the Riemann-Siegel
formula? This idea was pursued by Motohashi himself, and he [148II] [150] [153] obtained
a very precise asymptotic formula for $R_2(s; t/2\pi)$. His result includes

$$\chi(1-s)R_2\left(s; \frac{t}{2\pi}\right) = \left(\frac{t}{2\pi}\right)^{-1/4}\sum_{n=1}^{\infty} d(n)h(n)\,n^{-1/4}\sin\left(2\sqrt{2\pi tn} + \frac{\pi}{4}\right) + O(t^{-1/2}\log t), \tag{4.1}$$

where

$$h(n) = \left(\frac{2}{\pi}\right)^{1/2}\int_0^{\infty}(u+n\pi)^{-1/2}\cos\left(u+\frac{\pi}{4}\right)du = -\frac{1}{\pi}n^{-1/2} + O(n^{-3/2}).$$

Actually Motohashi’s formula is more precise and complicated; the error term in (4.1) is
replaced by some more explicit terms and a smaller error $O(t^{-1}\log t)$. A simple
consequence of (4.1) and (2.3) is the relation

$$\chi(1-s)R_2\left(s; \frac{t}{2\pi}\right) = -\sqrt{2}\left(\frac{t}{2\pi}\right)^{-1/2}\Delta\left(\frac{t}{2\pi}\right) + O(t^{-1/4}), \tag{4.2}$$

which was announced in [148I]. A formula of the same type was already given long before in
Taylor’s posthumous article [186], but Motohashi [153] pointed out that Taylor’s argument
was incorrect.
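The passage from (4.1) to (4.2) can be made plausible by a short formal computation. The sketch below assumes that $h(n) = (2/\pi)^{1/2}\int_0^\infty (u+n\pi)^{-1/2}\cos(u+\pi/4)\,du$ and that (2.3) is Voronoi’s formula in its classical form $\Delta(x) = \frac{x^{1/4}}{\pi\sqrt{2}}\sum_{n\ge1} d(n)n^{-3/4}\cos(4\pi\sqrt{nx}-\pi/4) + O(x^{-1/4})$, which is not restated in this section. It only matches the leading terms and does not track the error terms or questions of convergence.

```latex
% Step 1: integrate by parts twice, with a = n\pi > 0.
\int_0^\infty (u+a)^{-1/2}\cos\!\left(u+\tfrac{\pi}{4}\right)du
  = -a^{-1/2}\sin\tfrac{\pi}{4}
    + \tfrac12\int_0^\infty (u+a)^{-3/2}\sin\!\left(u+\tfrac{\pi}{4}\right)du
  = -\frac{1}{\sqrt{2a}} + O\!\left(a^{-3/2}\right).

% Step 2: multiply by (2/\pi)^{1/2} and put a = n\pi.
h(n) = -\frac{1}{\pi}\,n^{-1/2} + O\!\left(n^{-3/2}\right).

% Step 3: insert the leading term of h(n) into the sum of (4.1). With x = t/2\pi we have
% 4\pi\sqrt{nx} = 2\sqrt{2\pi tn} and \sin(\theta+\pi/4) = \cos(\theta-\pi/4), hence
-\frac{1}{\pi}\,x^{-1/4}\sum_{n=1}^{\infty} d(n)\,n^{-3/4}
      \cos\!\left(4\pi\sqrt{nx}-\tfrac{\pi}{4}\right)
  = -\sqrt{2}\,x^{-1/2}\cdot\frac{x^{1/4}}{\pi\sqrt{2}}
      \sum_{n=1}^{\infty} d(n)\,n^{-3/4}\cos\!\left(4\pi\sqrt{nx}-\tfrac{\pi}{4}\right),
% which is -\sqrt{2}\,x^{-1/2} times the main term of \Delta(x), as in (4.2).
```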

Jutila [85] gave an alternative proof of (4.2). His starting point is Titchmarsh’s explicit
formula (1.16). His idea is to smooth the right-hand side of (1.16) by using multiple
integration, and then apply his own saddle-point lemma mentioned in Section 3. So far
Jutila’s method cannot give a proof of Motohashi’s precise formula ((4.1) and more). An
advantage of Jutila’s approach is that it can be applied to many other Dirichlet series
satisfying a certain functional equation. In [85], Jutila presented analogous results on

$$\varphi(s,F) = \sum_{n=1}^{\infty} a(n)\,n^{-s}, \tag{4.3}$$

where the $a(n)$ are the Fourier coefficients of a holomorphic cusp form
$F(z) = \sum_{n=1}^{\infty} a(n)\exp(2\pi inz)$ of weight $\kappa$ (an even integer) for the full modular group
$SL(2,\mathbb{Z})$. There are many analogous properties shared by $\zeta^2(s)$ and $\varphi(s,F)$, but a big
difference is that $\zeta^2(s)$ has a good square-root function (i.e. $\zeta(s)$), while $\varphi(s,F)$ does not.
This is why Motohashi’s approach cannot be applied to $\varphi(s,F)$. Recently, Guthmann [42]
[43] [44] has developed another unified approach to the remainder terms in the approximate
functional equations for $\zeta^2(s)$ and $\varphi(s,F)$.

The formula (4.2) tells us that a strong analogy between $R_2(s; t/2\pi)$ and $\Delta(t/2\pi)$ should
exist (cf. Ivić [69]). Kiuchi-Matsumoto [114] proved, as an analogue of (2.5), that

$$\int_2^T \left|R_2\left(\frac12+it; \frac{t}{2\pi}\right)\right|^2 dt = \sqrt{2\pi}\,C_0T^{1/2} + K(T), \tag{4.4}$$

where

$$C_0 = \sum_{n=1}^{\infty} d^2(n)h^2(n)\,n^{-1/2} \tag{4.5}$$

and $K(T) = O(T^{1/4}\log T)$. The proof is based on (4.1). Using a more precise form of
Motohashi’s formula, Kiuchi [111] gave the improved bound $K(T) = O(\log^5 T)$, and
suggested the conjecture

$$K(T) \sim C\log^3 T. \tag{4.6}$$

The hitherto best upper bound is

$$K(T) = O(\log^4 T) \tag{4.7}$$

due to Kiuchi [113II]. On the other hand, Ivić [72] proved

$$\int_2^T K(t)\,dt = C_1T\log^3 T + C_2T\log^2 T + O(T\log T) \tag{4.8}$$

with $C_1 < 0$, analogously to (2.7). This implies

$$K(T) = \Omega_-(\log^3 T), \tag{4.9}$$

which supports the conjecture (4.6).

Motohashi’s formula has been obtained in the symmetric case $x = y = t/2\pi$. What about
the non-symmetric case? Motohashi [148III] [150] proved a formula when $x = \alpha t/2\pi$,
$y = t/2\pi\alpha$ with a rational number $\alpha$, but the result is not so precise as in the symmetric
case. Jutila [85] considered the general situation, and in some non-symmetric cases his 
bound is better than (1.14). See also Jutila [86]. Mean-value results in the non-symmetric 
case were discussed by Kiuchi [112]. 


5 The Mean Square of $\zeta(s)$ in the Critical Strip

Now we return to the problem of evaluating $I_\sigma(T)$, and discuss the case $1/2 < \sigma < 1$.
After the classical result (1.1) of Landau and Schnee, the development in this direction
had been very slow (cf. Ingham [63] and (8.112) of Ivić [66]). In 1989, the author [135]
published the analogue of Atkinson’s formula in the strip $1/2 < \sigma < 3/4$. It is stated as

$$E_\sigma(T) = \Sigma_{1,\sigma}(T,X) + \Sigma_{2,\sigma}(T,X) + O(\log T), \tag{5.1}$$

where $E_\sigma(T)$ is defined, for $1/2 < \sigma < 1$, by

$$I_\sigma(T) = \zeta(2\sigma)T + (2\pi)^{2\sigma-1}\frac{\zeta(2-2\sigma)}{2-2\sigma}T^{2-2\sigma} + E_\sigma(T). \tag{5.2}$$

It can be easily seen that $E_\sigma(T) \to E(T)$ as $\sigma \to 1/2 + 0$. In the same paper [135], as
applications of (5.1), the author showed that

$$E_\sigma(T) = O(T^{1/(1+4\sigma)}\log^2 T) \qquad \left(\frac12 < \sigma < \frac34\right) \tag{5.3}$$

(the $O$-constant may depend on $\sigma$) and

$$\int_2^T E_\sigma(t)^2\,dt = A_1(\sigma)T^{5/2-2\sigma} + F_\sigma(T) \qquad \left(\frac12 < \sigma < \frac34\right) \tag{5.4}$$

with $F_\sigma(T) = O(T^{7/4-\sigma}\log T)$, where

$$A_1(\sigma) = \frac{2}{5-4\sigma}(2\pi)^{2\sigma-3/2}\frac{\zeta^2(3/2)}{\zeta(3)}\,\zeta\left(\frac52-2\sigma\right)\zeta\left(\frac12+2\sigma\right).$$

Independently of [135], Laurinčikas [129I] obtained the analogue of Atkinson’s formula
near the critical line, that is, an explicit formula for the error term in the asymptotic formula
for the integral

$$\int_0^T |\zeta(\sigma_T+it)|^2\,dt,$$

where $\sigma_T = 1/2 + l_T^{-1}$, $0 < l_T \le \log T$, and $l_T$ tends to infinity as $T \to \infty$. See also
Laurinčikas [129II] [131] [132] [133]; in [132], he proved that (5.3) (with a slightly different
log-factor) holds uniformly in $\sigma$.

An asymptotic formula over short intervals was obtained by Sankaranarayanan-Srinivas
[179] by a quite different method. They proved

$$\frac{1}{H}\int_T^{T+H}|\zeta(\sigma+it)|^2\,dt = \zeta(2\sigma) + O\left(\exp\left(-\frac{C_1(\log T)^{2-2\sigma}}{\log\log T}\right)\right)$$

for $\exp((\log T)^{2-2\sigma}) \le H \le T$ and $(1/2) + C_2(\log\log T)^{-1} \le \sigma \le 1 - C_3$ under the
assumption of the Riemann hypothesis. It should be noted that their method can be applied
to much more general Dirichlet series.

The basic tool of the author [135] is Oppenheim’s Voronoi-type formula [161] for the
error term $\Delta_{1-2\sigma}(x)$ defined by

$$\sum_{n\le x}\sigma_{1-2\sigma}(n) = \zeta(2\sigma)x + \frac{\zeta(2-2\sigma)}{2-2\sigma}x^{2-2\sigma} - \frac12\zeta(2\sigma-1) + \Delta_{1-2\sigma}(x). \tag{5.5}$$

The series in Oppenheim’s formula is convergent only for $\sigma < 3/4$, which is the reason why
the restriction $1/2 < \sigma < 3/4$ exists. It was pointed out in [135] that the coefficient $A_1(\sigma)$
tends to infinity when $\sigma \to 3/4 - 0$, which suggests some singular situation occurring at
$\sigma = 3/4$. Now we know that the behaviour of $E_\sigma(T)$ does in fact change at $\sigma = 3/4$, which
can be well observed in the following refinement of (5.4):

$$\int_2^T E_\sigma(t)^2\,dt = \begin{cases} A_1(\sigma)T^{5/2-2\sigma} + O(T) & (1/2 < \sigma < 3/4), \\ A_0T\log T + O(T) & (\sigma = 3/4), \\ O(T) & (3/4 < \sigma < 1), \end{cases} \tag{5.6}$$

where $A_0 = \zeta^2(3/2)\zeta(2)/\zeta(3)$. These are due to Matsumoto-Meurman [140II]
($1/2 < \sigma < 3/4$), Lam [125] ($\sigma = 3/4$), and Matsumoto-Meurman [140III] ($3/4 < \sigma < 1$),
respectively. (In [140III], the formula for $\sigma = 3/4$ was given with a slightly weaker error
term $O(T(\log T)^{1/2})$.)
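The relation between the constants can be checked numerically. The sketch below assumes the expression $A_1(\sigma) = \frac{2}{5-4\sigma}(2\pi)^{2\sigma-3/2}\frac{\zeta^2(3/2)}{\zeta(3)}\zeta(5/2-2\sigma)\zeta(1/2+2\sigma)$ and $A_0 = \zeta^2(3/2)\zeta(2)/\zeta(3)$; the helper `zeta` is a plain Euler-Maclaurin approximation written for this illustration, not a library routine. It confirms that $A_1(1/2)$ reduces to the constant of (3.3), and that $(3/2-2\sigma)A_1(\sigma) \to A_0$ as $\sigma \to 3/4 - 0$, which is how the pole of $\zeta(5/2-2\sigma)$ turns into the factor $\log T$.

```python
import math

def zeta(s, N=1000):
    """Riemann zeta for real s > 0, s != 1, by Euler-Maclaurin summation."""
    head = sum(n ** (-s) for n in range(1, N))
    tail = N ** (1 - s) / (s - 1) + 0.5 * N ** (-s) + s * N ** (-s - 1) / 12
    return head + tail

def A1(sigma):
    """Coefficient of T^(5/2 - 2 sigma) in (5.4), for 1/2 <= sigma < 3/4."""
    return (2 / (5 - 4 * sigma) * (2 * math.pi) ** (2 * sigma - 1.5)
            * zeta(1.5) ** 2 / zeta(3)
            * zeta(2.5 - 2 * sigma) * zeta(0.5 + 2 * sigma))

A0 = zeta(1.5) ** 2 * zeta(2) / zeta(3)

# Constant of Heath-Brown's formula (3.3).
heath_brown = 2 * zeta(1.5) ** 4 / (3 * math.sqrt(2 * math.pi) * zeta(3))

eps = 1e-4                      # eps = 3/2 - 2 sigma
ratio = eps * A1(0.75 - eps / 2) / A0
```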

To prove the result for $1/2 < \sigma < 3/4$, that is $F_\sigma(T) = O(T)$, Matsumoto-Meurman
[140II] gave a new averaged version (somewhat similar to (3.13)) of the Atkinson-type formula,
which is proved by combining the methods of Meurman [142] and Preissmann [165] with
some additional new ideas. In the same paper [140II], the conjecture

$$F_\sigma(T) \sim 4\pi^2\zeta(2\sigma-1)^2\,T \qquad \left(\frac12 < \sigma < \frac34\right) \tag{5.7}$$

was proposed. There are several heuristic arguments which may suggest (5.7) (see [115]
[136]). The reason presented in [140II] is the fact that $E_\sigma(T)$ has the mean value
$-2\pi\zeta(2\sigma-1)$. This fact was discovered independently by Ivić [68]; he proved

$$\int_2^T E_\sigma(t)\,dt = B(\sigma)T + O(T^{5/4-\sigma}) \qquad \left(\frac12 < \sigma < \frac34\right) \tag{5.8}$$

((3.39) of [68]). The expression of $B(\sigma)$ given in [68] is complicated, but it is actually
equal to $-2\pi\zeta(2\sigma-1)$ (see Appendix of Matsumoto-Meurman [140II]). The above (5.8)
is a direct consequence of

$$\int_2^T E_\sigma(t)\,dt = B(\sigma)T + 2^{\sigma-3/4}\left(\frac{T}{\pi}\right)^{5/4-\sigma}\sum_{n=1}^{\infty}(-1)^n\sigma_{1-2\sigma}(n)\,n^{\sigma-7/4}\sin\left(\sqrt{8\pi nT}-\frac{\pi}{4}\right) + O(T^{1-2\sigma/3}\log T) \tag{5.9}$$

((3.30) of Ivić [68]), which is the analogue of (3.23). We mention here that it might be
better to define the “real” error term in (5.2) (resp. (1.7)) as $E_\sigma(T) + 2\pi\zeta(2\sigma-1)$ (resp.
$E(T) - \pi$). The constant $-2\pi\zeta(2\sigma-1)$ (resp. $\pi$) corresponds to $-(1/2)\zeta(2\sigma-1)$ in
(5.5) (resp. 1/4 in (2.1)).

Matsumoto-Meurman [140III] proved that the formula (5.1) is valid for all $\sigma$ satisfying
$1/2 < \sigma < 1$. When $\sigma \ge 3/4$ the Voronoi-type formula for the Riesz mean of $\sigma_{1-2\sigma}(n)$ is
applied in [140III], because Oppenheim’s series is divergent. It is again a certain averaged
version of the Atkinson-type formula from which the case $3/4 < \sigma < 1$ of (5.6) was deduced
in [140III]. (Here we note that in the statement of Lemma 4 of [140III], 0 should be deleted.
The author would like to thank Dr. Hideki Nakaya who pointed out this mistake.)

As an extension of (5.7), the author proposed the conjecture that the error terms $O(T)$ in
(5.6) could be replaced by $A_2(\sigma)T + o(T)$ for $1/2 < \sigma < 1$, with a certain constant $A_2(\sigma)$
(see [137]). A refined version is:


Conjecture 1 The error terms $O(T)$ in (5.6) could be replaced by
$$A_2(\sigma)T + O(T^{2-2\sigma}(\log T)^C) \tag{5.10}$$
for $1/2 < \sigma < 3/4$, where $C \ge 0$, and by
$$A_2(\sigma)T + A_3(\sigma)T^{5/2-2\sigma} + O(T^{2-2\sigma}(\log T)^C) \tag{5.11}$$
with a certain $A_3(\sigma)$ for $3/4 < \sigma < 1$.


The reason for the error estimates $O(T^{2-2\sigma}(\log T)^C)$ is the result (6.6) mentioned in the
next section. The author proposed (5.10) first in correspondence, which is mentioned in
Ivić-Kiuchi [74]. The conjecture (5.11) first appeared in [115] (though the term
$A_3(\sigma)T^{5/2-2\sigma}$ is missing there). Even the weaker form of the above conjecture is still open.

The formula (5.4) obviously implies $E_\sigma(T) = \Omega(T^{3/4-\sigma})$ for $1/2 < \sigma < 3/4$. Ivić [68]
improved this to $\Omega_\pm(T^{3/4-\sigma})$ with some information about local sign-changes of $E_\sigma(T)$.
The best known $\Omega$-results at present are

$$E_\sigma(T) = \Omega_-\left\{T^{3/4-\sigma}\exp\left(C(\log\log T)^{\sigma-1/4}(\log\log\log T)^{\sigma-5/4}\right)\right\} \qquad \left(\frac12 < \sigma < \frac34\right) \tag{5.12}$$

(Ivić-Matsumoto [75]), exactly corresponding to (3.21), and

$$E_\sigma(T) = \Omega_+\left(T^{3/4-\sigma}(\log T)^{\sigma-1/4}\right) \qquad \left(\frac12 < \sigma < \frac34\right) \tag{5.13}$$

(Matsumoto-Meurman [140III]). It is much more difficult to obtain any $\Omega$-result in the strip
$3/4 \le \sigma < 1$. The only known result is

$$E_{3/4}(T) = \Omega((\log T)^{1/2}), \tag{5.14}$$

a direct consequence of the case $\sigma = 3/4$ of (5.6).
What is the real order of $E_\sigma(T)$? In view of (5.6) and the above $\Omega$-results, we may
formulate the conjecture

$$E_\sigma(T) \ll \begin{cases} T^{3/4-\sigma+\varepsilon} & (1/2 < \sigma \le 3/4), \\ T^{\varepsilon} & (3/4 \le \sigma < 1). \end{cases} \tag{5.15}$$

In Ivić-Matsumoto [75] this conjecture is stated, and it is also pointed out that if we assume
the very strong conjecture that $(\varepsilon, 1/2+\varepsilon)$ would be an exponent pair for any $\varepsilon > 0$, then
(5.15) would follow.

The critical behaviour of $E_\sigma(T)$ at $\sigma = 3/4$ is again clear in (5.15); it might suggest
some unexpected properties of $\zeta(s)$. In connection with this observation, an interesting
discussion concerning the Lindelöf hypothesis is given in Ivić [71]. See also the final
section of [136].

The proof of the conjecture (5.15) seems to be out of reach now. As for the upper bound of
$E_\sigma(T)$, Motohashi (unpublished) proved that (5.3) holds for any $\sigma$ satisfying $1/2 < \sigma < 1$.
His idea, inspired by his own work [151] on the fourth power mean of $\zeta(s)$, is to use the
weighted integral

$$\frac{1}{\Delta\sqrt{\pi}}\int_{-\infty}^{\infty}|\zeta(\sigma+i(T+t))|^2\,e^{-(t/\Delta)^2}\,dt \qquad (\Delta > 0). \tag{5.16}$$

Ivić [68] combined Motohashi’s idea with the theory of exponent pairs, and obtained various
improved upper bounds of $E_\sigma(T)$. This direction was further studied by Ivić-Matsumoto
[75] and Kačenas [94] [95]; for instance, we have


$$E_\sigma(T) = O(T^{2(1-\sigma)/3}(\log T)^{2/9}) \qquad \left(\frac12 < \sigma < 1\right)$$


(Ivić-Matsumoto [75]) and


E,(T) = orm 38+) (Lega 2h 
2 100 
with $\delta = \sigma - 1/2$ (Kačenas [95]). The latter is uniform in $\sigma$, and exactly corresponds to
Huxley’s bound (3.20).

We conclude this section by mentioning the case $\sigma = 1$. No analogue of Atkinson’s
formula is known in this case. Starting from the simple approximate formula (1.2),
Balasubramanian-Ivić-Ramachandra [7] proved the asymptotic formula

$$I_1(T) = \zeta(2)T - \pi\log T + E_1(T) \tag{5.17}$$

with

$$E_1(T) = O((\log T)^{2/3}(\log\log T)^{1/3}). \tag{5.18}$$

The connection between $E_\sigma(T)$ and $E_1(T)$ is given by


lim | £20) + Omye-1 2 = 22) 


eG (foo = | = ¢(2)T —xlogT 


(Ivić [70]). In [7], they also proved the mean value results

$$\int_2^T E_1(t)\,dt = O(T), \qquad \int_2^T E_1(t)^2\,dt = O(T(\log\log T)^4),$$

and conjectured that the latter integral would be asymptotically equal to $CT$.

To show the estimate (5.18), the method of I.M. Vinogradov and Korobov, based on
the deep theory of I.M. Vinogradov on the estimation of exponential sums, is applied.
In fact, it is noted in [7] that from (5.18) one can deduce the estimate
$\zeta(1+it) = O((\log t)^{2/3}(\log\log t)^{1/3})$, which is very close to the sharpest known bound
$\zeta(1+it) = O((\log t)^{2/3})$, obtained by the Vinogradov-Korobov theory (see Chapter 6 of Ivić [66]).


6 Mean Values of $\Delta_{1-2\sigma}(x)$ and $R_2(\sigma+it; t/2\pi)$


In Section 3 we explained that a guiding principle of the study of $E(T)$ is to pursue the
analogy with $\Delta(x)$. Similarly, it is useful to study the behaviour of $\Delta_{1-2\sigma}(x)$, defined by
(5.5), which is the object analogous to $E_\sigma(T)$.

We already mentioned in Section 5 that the Voronoi-type formula for $\Delta_{1-2\sigma}(x)$ due to
Oppenheim [161] was used in the proof of (5.1). By using the truncated Voronoi-type
formula, Kiuchi [109] proved that

$$\int_2^X \Delta_{1-2\sigma}(x)^2\,dx = B_1(\sigma)X^{5/2-2\sigma} + O(X^{7/4-\sigma+\varepsilon}) \qquad \left(\frac12 < \sigma < \frac34\right) \tag{6.1}$$

with

$$B_1(\sigma) = \frac{\zeta^2(3/2)}{2\pi^2(5-4\sigma)\zeta(3)}\,\zeta\left(\frac52-2\sigma\right)\zeta\left(\frac12+2\sigma\right).$$

It was already mentioned by Cramér [22] that the left-hand side of (6.1) is asymptotically
equal to $B_1(\sigma)X^{5/2-2\sigma}$ for $1/2 < \sigma < 3/4$. Meurman [143] refined (6.1) to obtain

$$\int_2^X \Delta_{1-2\sigma}(x)^2\,dx = \begin{cases} B_1(\sigma)X^{5/2-2\sigma} + O(X) & (1/2 < \sigma < 3/4), \\ B_0X\log X + O(X) & (\sigma = 3/4), \\ O(X) & (3/4 < \sigma < 1) \end{cases} \tag{6.2}$$

with $B_0 = \zeta^2(3/2)/(24\zeta(3))$. The formula (6.2) gives the complete analogue of (5.6). Hence
the following analogue of Conjecture 1 can be formulated.


Conjecture 2 The error term $O(X)$ in (6.2) could be replaced by
$$B_2(\sigma)X + O(X^{2-2\sigma}(\log X)^C) \tag{6.3}$$
for $1/2 < \sigma < 3/4$, and by
$$B_2(\sigma)X + B_3(\sigma)X^{5/2-2\sigma} + O(X^{2-2\sigma}(\log X)^C) \tag{6.4}$$
for $3/4 < \sigma < 1$, with certain $B_2(\sigma)$, $B_3(\sigma)$ and $C \ge 0$.


Meurman first proposed (6.3) in correspondence, while (6.4) appeared in
Kiuchi-Matsumoto [115], though the term $B_3(\sigma)X^{5/2-2\sigma}$ is missing there.

In Section 4, we discussed the analogy between $\Delta(x)$ and $R_2(1/2+it; t/2\pi)$. We can find
that there also exists an analogy between $\Delta_{1-2\sigma}(x)$ and $R_2(\sigma+it; t/2\pi)$ for $1/2 < \sigma < 1$.
Kiuchi [113I] proved that

$$\int_2^T |R_2(\sigma+it; t/2\pi)|^2\,dt = \begin{cases} C_1(\sigma)T^{3/2-2\sigma} + O(1) & (1/2 < \sigma < 3/4), \\ \pi C_0\log T + O(1) & (\sigma = 3/4), \\ O(1) & (3/4 < \sigma < 1), \end{cases} \tag{6.5}$$

where $C_0$ is defined by (4.5) and $C_1(\sigma) = (2\pi)^{2\sigma-1/2}C_0/(3-4\sigma)$. This precisely corresponds
to (5.6) and (6.2).

A remarkable fact is that we can go further in this case. Now it is known that the terms
$O(1)$ in (6.5) can be replaced by

$$\begin{cases} C_2(\sigma) + O(T^{1-2\sigma}(\log T)^4) & (1/2 < \sigma < 3/4), \\ C_2(\sigma) + C_1(\sigma)T^{3/2-2\sigma} + O(T^{1-2\sigma}(\log T)^4) & (3/4 < \sigma < 1) \end{cases} \tag{6.6}$$

with a certain constant $C_2(\sigma)$. The author [137] showed (6.6) in the case of $1/2 < \sigma < 3/4$,
and in the same paper the weaker result with the error estimate $O(T^{1/4-\sigma})$ was given for
$3/4 < \sigma < 1$. The result of the form (6.6) for $3/4 < \sigma < 1$ is due to Kiuchi [113II]. The
above (6.6) implies that the facts corresponding to Conjectures 1 and 2 are indeed true for
$R_2(\sigma+it; t/2\pi)$.

Higher moments of $R_2(\sigma+it; t/2\pi)$ have also been discussed. The results analogous to
(3.15)–(3.18) for $R_2(1/2+it; t/2\pi)$ were obtained by Kiuchi [110] and Ivić [69]. The $k$-th
power moment of $R_2(\sigma+it; t/2\pi)$, where $k$ is a positive even integer and $0 < \sigma < 1$, was
studied by Kiuchi-Matsumoto [115]. Their results especially imply that the transposing
line for the $k$-th power moment is $\sigma = 1/4 + 1/k$, unconditionally for $k = 2, 4, 6$ and $8$, and
under a certain plausible assumption for any even $k$.

In the case $3/4 < \sigma < 1$, the bound $O(X)$ in (6.2) is not the best known result. Already
in 1932, Chowla [18] proved the asymptotic formula


is lL ica Gayo 5 By 
| arava = sts |) (Sa) X + O(X2~”* log X) 


n=1 
(; <o< 1), (6.7) 


which gives a partial solution of the case $3/4 < \sigma < 1$ of Conjecture 2. Recently,
Yanagisawa [198] rediscovered (6.7) and also obtained more general results. The basic tool
of both Chowla and Yanagisawa is a generalization of (2.14), that is

$$\Delta_{1-2\sigma}(x) = -G_{1-2\sigma}(x) - x^{1-2\sigma}G_{2\sigma-1}(x) + O(x^{1/2-\sigma}), \tag{6.8}$$

where

$$G_a(x) = \sum_{n\le\sqrt{x}} n^a\,\psi\left(\frac{x}{n}\right).$$

(As for (6.8), see Kanemitsu [98].) An asymptotic formula for the mean square of $\Delta_{-1}(x)$
was given by Walfisz [197].
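The decomposition $\Delta_{1-2\sigma}(x) = -G_{1-2\sigma}(x) - x^{1-2\sigma}G_{2\sigma-1}(x) + O(x^{1/2-\sigma})$ lends itself to a quick numerical check. The sketch below assumes $\psi(u) = u - \lfloor u\rfloor - 1/2$, $G_a(x) = \sum_{n\le\sqrt{x}} n^a\psi(x/n)$, and the definition of $\Delta_{1-2\sigma}(x)$ from (5.5), evaluated at a non-integer $x$. The helper `zeta` is an ad hoc Euler-Maclaurin approximation and all names are illustrative.

```python
import math

def zeta(s, N=1000):
    """Riemann zeta for real s > 0, s != 1, by Euler-Maclaurin summation."""
    head = sum(n ** (-s) for n in range(1, N))
    tail = N ** (1 - s) / (s - 1) + 0.5 * N ** (-s) + s * N ** (-s - 1) / 12
    return head + tail

def sigma_sum(a, x):
    """sum_{n <= x} sigma_a(n), written as sum_{d <= x} d^a * floor(x / d)."""
    n = int(x)
    return sum(d ** a * (n // d) for d in range(1, n + 1))

def delta_sigma(sig, x):
    """Delta_{1-2 sigma}(x) as defined by (5.5), for non-integer x."""
    a = 1 - 2 * sig
    main = (zeta(2 * sig) * x
            + zeta(2 - 2 * sig) / (2 - 2 * sig) * x ** (2 - 2 * sig)
            - 0.5 * zeta(2 * sig - 1))
    return sigma_sum(a, x) - main

def psi(u):
    return u - math.floor(u) - 0.5

def G(a, x):
    return sum(n ** a * psi(x / n) for n in range(1, math.isqrt(int(x)) + 1))

def chowla_main(sig, x):
    """Right-hand side of (6.8) without the O-term."""
    a = 1 - 2 * sig
    return -G(a, x) - x ** a * G(-a, x)
```

For $\sigma = 0.8$ and $x = 10000.5$ the two sides should differ by an amount of the order $x^{1/2-\sigma} \approx 0.06$, far smaller than the main terms of (5.5).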
On the other hand, as an analogue of (2.7), Lam-Tsang [126] proved 


‘ 5 94 (1=29)3-40) l 3 
| dg (x)dx = C(a)X* + O(X 2(3-26) log X) 5 <o< > (6.9) 
2 
where 
x 3 
dg (X) = | A} ~29 (x)°dx — By (a) X27” 
2 


and 
_ $(20)°6(3 — 40) 
12(27)3-4° ¢ (40) 


This result clearly implies the fact 


C(o) = (3 — 40) sin(270). 


$$\delta_\sigma(X) = \Omega_-(X) \qquad \left(\frac12 < \sigma < \frac34\right), \tag{6.10}$$

which may be regarded as support for the case $1/2 < \sigma < 3/4$ of Conjecture 2. The
order of $\delta_\sigma(X)$ for $1/2 < \sigma < 3/4$ is completely determined by (6.2) and (6.10).

It is an interesting problem to prove the analogue of Lau-Tsang’s (2.7) or Lam-Tsang’s
(6.9) for $F(T)$ or $F_\sigma(T)$. Another attractive problem is to search for the analogue of the method
of Chowla and Yanagisawa for the function $E_\sigma(T)$ in the case $3/4 < \sigma < 1$; or at least, to
find the analogue of (6.8) for $E_\sigma(T)$. The last type of problem was sometimes mentioned
by S. Kanemitsu in correspondence and oral communication.


7 Some Mean Value Results in Short Intervals


We mentioned in Section 1 that Good [35] showed $E(T) = \Omega(T^{1/4})$. He actually proved
an asymptotic formula for the integral

$$\int_0^T (E(t+U) - E(t))^2\,dt \qquad (1 \le U \ll T^{1/2}),$$

and the $\Omega$-result is its corollary. The same formula is also used in the proof of Heath-Brown
and Tsang [58] mentioned in Section 3.


Next, Jutila [84] studied a similar problem, but for short intervals, by using Atkinson’s
formula. His result is


T+H 
| (E(t + U) — E(t))*dt 
T 


T+H 
w) = l 
d(n)“n ‘| t2 
+ T 
U 


l 
= a 
+ O(T'**) + O(HU?T*) (7.1) 
for $T \ge 2$, $1 \le U \ll T^{1/2} \ll H \le T$. Note that the right-hand side can be estimated as
$$\ll (HU + T)T^{\varepsilon}. \tag{7.2}$$


Jutila did not give the details of the proof; instead, he described the proof (based on the 
truncated Voronoi formula (2.4)) of the corresponding formula for A(x), that is 


X+H 
/ (A(x + U) — A(x))*dx 
xX 


+ O(X!*®) + O(HUZX®) (7.3) 


for $X \ge 2$, $1 \le U \ll X^{1/2} \ll H \le X$. Moreover, Jutila raised the problem of extending
(7.1) and (7.3) to higher power moments. In particular, he pointed out that if one could
prove

$$\int_2^T (E(t+U) - E(t))^4\,dt = O(T^{1+\varepsilon}U^2),$$

then the very important conjectural bound

$$\int_0^T \left|\zeta\left(\frac12+it\right)\right|^6 dt = O(T^{1+\varepsilon})$$

would follow.

A formula for $\Delta_{1-2\sigma}(x)$ ($1/2 < \sigma < 1$), analogous to (7.3), was recently obtained
by Kiuchi-Tanigawa [116]; they actually treated a more general quantity which involves
exponential factors. Yanagisawa [199] studied the same problem by a method similar to
that of his other work [198] mentioned in the preceding section. The analogue of (7.1) for $E_\sigma(T)$
($1/2 < \sigma < 1$) was given by Kiuchi-Tanigawa [117]. In [118], they studied the same type
of short interval mean square of $R_2(\sigma+it; \alpha t/2\pi, t/2\pi\alpha)$ for rational $\alpha$.

In [88I], Jutila proved the estimate

$$\int_T^{T+H} (E(t+U) - E(t))^2\,dt = O((HU + T^{2/3}U^{4/3})T^{\varepsilon}) \tag{7.4}$$

for $1 \le H, U \le T$, which improves (7.2) when $U \ll T^{1/4}$. Jutila noted that (7.4) implies
the estimate

$$\int_T^{T+T^{2/3}} \left|\zeta\left(\frac12+it\right)\right|^4 dt = O(T^{2/3+\varepsilon}), \tag{7.5}$$

due originally to Iwaniec [79]. In fact, since

$$\int_t^{t+U} \left|\zeta\left(\frac12+iu\right)\right|^2 du$$

can be approximated by $E(t+U) - E(t)$, (7.5) easily follows from (7.4) with $H = T^{2/3}$,
$U = T^{\varepsilon}$, by applying Lemma 7.1 of Ivić [66].

Iwaniec’s proof [79] of (7.5) was really epoch-making, because it was the first successful
application of Kuznetsov’s trace formula [122] to the mean value theory of zeta-functions,
and was followed by many important works of Zavorotnyi, Motohashi, Ivić, Jutila and
others in the fourth power moment theory. The full account of this theory is beyond the
scope of this article, but here we should mention Jutila’s alternative proof [89] of (7.5). The
basic idea of Jutila is to transform a certain relevant exponential sum by using (3.9), hence
it follows the same philosophy as [83] [87]. The remarkable feature of Jutila’s proof is that
it only uses classical means, without the fancy tools of spectral theory. In Jutila’s proof,
a lemma due to Bombieri-Iwaniec [13] plays an important role. This lemma contains the
arithmetic essence of (7.5), which is included in Kuznetsov’s formula (or Kloosterman’s
sum) in Iwaniec’s original proof.

Extending the above idea, Jutila [88I, II] studied integrals of the type

$$J = \sum_{r=1}^{R}\int_0^V \left|\sum_{M\le m\le M'} d(m)\,g(m,v,y_r)\exp(2\pi if(m,v,y_r))\right|^2 dv, \tag{7.6}$$

where $M$ is a large positive number, $M < M' \le 2M$, $V > 0$, the functions $f$ and $g$
satisfy certain regularity conditions, and $y_r$ ($1 \le r \le R$) runs over a well-spaced set of
numbers lying in $[0,1]$. Jutila [88I] proved a certain upper bound of $J$, from which he
deduced (7.4) as well as its analogue for $\Delta(x)$. A further development of this method, with
applications to Dirichlet $L$-functions, can be found in Jutila [90].

Lastly in this section we mention briefly the theory of Titchmarsh series developed by
Ramachandra and his colleagues. Here we do not give the definition of general Titchmarsh
series. They are elements of a certain class of Dirichlet series, including $\zeta(s)^k$ (for any
positive integer $k$) as an example. In [169I], Ramachandra raised a conjecture on the
lower bound of the mean square of Titchmarsh series over short intervals. Ramachandra
(partly with Balasubramanian) wrote many papers ([9], [168]-[170]) on this topic, and
finally, Balasubramanian-Ramachandra [12] (and Ramachandra [171]) completely solved
the conjecture in a more precise form. This solution especially implies

$$\frac{1}{H}\int_T^{T+H}\left|\zeta\left(\frac12+it\right)\right|^{2k} dt \ge C_k(\log H)^{k^2} + O\left(\frac{\log\log T}{H}(\log H)^{k^2}\right) + O((\log H)^{k^2-1}) \tag{7.7}$$

for $\log\log T \ll H \le T$, where

$$C_k = \frac{1}{\Gamma(k^2+1)}\prod_p\left\{\left(1-\frac1p\right)^{k^2}\sum_{m=0}^{\infty}\left(\frac{\Gamma(k+m)}{\Gamma(k)\,m!}\right)^2p^{-m}\right\}.$$

Ramachandra [171] includes an interesting lower bound of the mean square of $\zeta(1+it)$
over short intervals. Upper and lower bounds of the mean value 


T+H 
i |, 


were studied by Ramachandra [168]. 

In the present article we do not discuss the full details of the theory of Titchmarsh series.
This theory includes the treatment of the mean value of $|\zeta(s)^{2k}|$ with non-integral complex
values of $k$, various $\Omega$-results, sign-change theorems on $\arg\zeta(s)$, and generalizations etc.
The readers are referred to Ramachandra’s lecture notes [172].


dt 
s= s+it 


Seis 


8 Several General Principles and the Mean Square 
of Dedekind Zeta-functions 


In the previous sections we discussed mainly the mean square of the Riemann zeta-function 
and related problems. A very important direction of research is to generalize the obtained 
results to various other zeta-functions. From this viewpoint, it is useful to find some general 
principles to obtain mean value results. One of them is the following classical theorem of 
Carlson [15], which is a generalization of (1.1) due to Landau and Schnee. Let

$$f(s) = \sum_{n=1}^{\infty} a_n n^{-s}$$

be a Dirichlet series, convergent in a certain half-plane. Assume that $f(s)$ can be continued
to a holomorphic (or with possible poles included in a fixed compact set) function of finite
order in the region $\sigma \ge \alpha + \varepsilon > \alpha$. Moreover suppose that

$$\int_{-T}^{T}|f(\sigma+it)|^2\,dt = O(T) \tag{8.1}$$

for $\sigma \ge \sigma_0 > \alpha$. Then Carlson’s theorem asserts that

$$\int_{-T}^{T}|f(\sigma+it)|^2\,dt \sim 2\left(\sum_{n=1}^{\infty}|a_n|^2n^{-2\sigma}\right)T \tag{8.2}$$

for $\sigma > \sigma_0$. (The part of the range of integration near the poles is omitted.) Potter [163I, II]
studied this matter further. Potter’s results are especially useful when $f(s)$ can be continued
to the whole plane and satisfies a certain functional equation.

Carlson’s theorem can be applied to the mean square of the Dedekind zeta-function $\zeta_K(s)$
attached to an algebraic number field $K$, and the result is

$$\int_1^T|\zeta_K(\sigma+it)|^2\,dt \sim \left(\sum_{n=1}^{\infty}a_K(n)^2n^{-2\sigma}\right)T \qquad (\sigma > 1-l^{-1}),$$

where $l = [K:\mathbb{Q}] \ge 2$ and $a_K(n)$ is the number of integral ideals in $K$ with norm $n$. But
this is not the best known result. Chandrasekharan-Narasimhan [16] developed a general
theory of approximate functional equations, and the following is a consequence of their
theory:

$$\int_1^T|\zeta_K(\sigma+it)|^2\,dt = \left(\sum_{n=1}^{\infty}a_K(n)^2n^{-2\sigma}\right)T + O(T^{l(1-\sigma)}(\log T)^l) \tag{8.3}$$

if $\sigma > 1 - l^{-1}$, and

$$\int_1^T|\zeta_K(\sigma+it)|^2\,dt = O(T^{l(1-\sigma)}(\log T)^l) \tag{8.4}$$

if $1/2 \le \sigma \le 1 - l^{-1}$. When $l = 2$, that is the case that $K$ is a quadratic field, (8.3) gives
the asymptotic formula for $\sigma > 1/2$. In the case of $\sigma = 1/2$, (8.4) gives the upper bound
$O(T\log^2 T)$. At the end of their paper [16], Chandrasekharan-Narasimhan conjectured


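$$\int_{1}^{T} \left|\zeta_K\left(\tfrac{1}{2} + it\right)\right|^2 dt \sim C_2\, T \log^2 T \tag{8.5}$$

for any real quadratic field $K$, with a certain constant $C_2$.

Let $D$ be the discriminant of a quadratic field $K$, $\chi_D$ the Dirichlet character defined as the
Kronecker symbol $\chi_D(n) = \left(\frac{D}{n}\right)$, and $L(s, \chi_D)$ the corresponding Dirichlet $L$-function.
It is well known that $\zeta_K(s) = \zeta(s)L(s, \chi_D)$, hence the mean square of $\zeta_K(1/2 + it)$ is
a generalization of the fourth power moment of $\zeta(1/2 + it)$. As for the latter problem,
Titchmarsh [187] proved

$$\int_{0}^{\infty} \left|\zeta\left(\tfrac{1}{2} + it\right)\right|^4 e^{-\delta t}\,dt \sim \frac{1}{2\pi^2}\,\frac{1}{\delta} \log^4 \frac{1}{\delta} \tag{8.6}$$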
as $\delta \to 0$. This immediately implies

$$\int_{0}^{T} \left|\zeta\left(\tfrac{1}{2} + it\right)\right|^4 dt \sim \frac{1}{2\pi^2}\, T \log^4 T, \tag{8.7}$$

because there is the general principle that if $f(t) \geq 0$ for all $t$ and

$$\int_{0}^{\infty} f(t) e^{-\delta t}\,dt \sim \frac{1}{\delta} \log^m \frac{1}{\delta} \tag{8.8}$$

as $\delta \to 0$, then

$$\int_{0}^{T} f(t)\,dt \sim T \log^m T \tag{8.9}$$


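(see Section 7.12 of Titchmarsh [190]). This principle is, in a sense, a kind of Tauberian
theorem. Following the idea of Titchmarsh, Motohashi [147] proved that

$$\int_{0}^{\infty} \left|\zeta_K\left(\tfrac{1}{2} + it\right)\right|^2 e^{-\delta t}\,dt \sim C_2\,\frac{1}{\delta} \log^2 \frac{1}{\delta} \tag{8.10}$$

as $\delta \to 0$, and thereby established the conjecture (8.5). He found that

$$C_2 = \frac{6}{\pi^2}\, L^2(1, \chi_D) \prod_{p \mid D} \left(1 + \frac{1}{p}\right)^{-1}.$$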


Another useful general principle is the mean value theorem for Dirichlet polynomials.
For any complex numbers $a_1, \ldots, a_N$, we have

$$\int_{0}^{T} \left|\sum_{n \leq N} a_n n^{it}\right|^2 dt = T \sum_{n \leq N} |a_n|^2 + O\left(\sum_{n \leq N} n |a_n|^2\right). \tag{8.11}$$

(This remains valid for $N = \infty$, if the series on the right-hand side converge.) The
formula (8.11) is due to Montgomery-Vaughan [146], and the key to their proof is the
following generalization of Hilbert's inequality: Let $\lambda_1, \ldots, \lambda_R$ be distinct real numbers
and $\delta_n = \min_{m \neq n} |\lambda_m - \lambda_n|$. Then, for any complex numbers $a_1, \ldots, a_R$, we have

$$\left|\sum_{m \neq n} \frac{a_m \overline{a_n}}{\lambda_m - \lambda_n}\right| \leq \frac{3\pi}{2} \sum_{n} |a_n|^2 \delta_n^{-1}. \tag{8.12}$$

This inequality has a close connection with the theory of large sieve inequalities; see
Montgomery [144] [145].
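
The way Hilbert's inequality controls the error term in the mean value theorem for Dirichlet polynomials can be illustrated numerically. The following is only a sketch under stated assumptions: the function names `mean_square` and `hilbert_bound` are hypothetical, the exponents are taken to be $\lambda_n = \log n$, and the constant $3\pi/2$ of the Montgomery-Vaughan inequality is applied twice (once to the coefficients $a_n e^{iT\lambda_n}$ and once to $a_n$), which bounds the off-diagonal part by $3\pi \sum_n |a_n|^2 \delta_n^{-1}$.

```python
import cmath
import math


def mean_square(a, T):
    """Split int_0^T |sum_n a_n n^{it}|^2 dt into diagonal and off-diagonal parts.

    a[0] is a_1, a[1] is a_2, and so on. Both parts are exact closed forms,
    obtained by expanding the square and integrating term by term.
    """
    N = len(a)
    diagonal = T * sum(abs(x) ** 2 for x in a)
    off = 0j
    for m in range(1, N + 1):
        for n in range(1, N + 1):
            if m != n:
                lam = math.log(m) - math.log(n)
                coeff = complex(a[m - 1]) * complex(a[n - 1]).conjugate()
                off += coeff * (cmath.exp(1j * T * lam) - 1) / (1j * lam)
    # The (m, n) and (n, m) terms are complex conjugates, so the sum is real.
    return diagonal, off.real


def hilbert_bound(a):
    """Return 3*pi * sum_n |a_n|^2 / delta_n with lambda_n = log n (needs N >= 2)."""
    N = len(a)
    total = 0.0
    for n in range(1, N + 1):
        delta = min(
            abs(math.log(m) - math.log(n)) for m in range(1, N + 1) if m != n
        )
        total += abs(a[n - 1]) ** 2 / delta
    return 3 * math.pi * total


if __name__ == "__main__":
    coeffs = [1.0] * 30
    diag, off = mean_square(coeffs, 500.0)
    print(diag, off, hilbert_bound(coeffs))
```

Since $\delta_n$ is of order $1/n$ for $\lambda_n = \log n$, the bound returned by `hilbert_bound` is of the same order as the error term $\sum n|a_n|^2$ in the mean value theorem.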

Using (8.11), Ramachandra [166] gave a simple proof of Ingham's (1.11). Applying the
same idea to $\zeta_K(1/2 + it)$, it is possible to prove

$$\int_{1}^{T} \left|\zeta_K\left(\tfrac{1}{2} + it\right)\right|^2 dt = C_2\, T \log^2 T + O(T \log T)$$

for a quadratic field $K$.

A further refinement was done by Müller [156], who generalized Heath-Brown's proof
of (1.20). His result is that


: (G+i)e(s +i ) 
. ¢ ee pt ee 


+ O(get?T St), (8.13) 


2 
dt = CoT log? T + CiT logT + CoT 


where $\chi$ is a primitive Dirichlet character (mod $q \geq 2$), and the leading coefficient is
similar to $C_2$, with $L^2(1, \chi_D)$ replaced by $|L(1, \chi)|^2$.

Another possible way is to generalize the recent spectral-theoretic developments of the
fourth power moment theory of $\zeta(s)$. See Motohashi [154], in which a certain explicit
formula is given.


9 L-functions Attached to Cusp Forms 


In the preceding section we discussed the analogy between $\zeta_K(s)$ for quadratic fields and
$\zeta^2(s)$. Another class of Dirichlet series, which may be regarded as an analogue of $\zeta^2(s)$,
is $L$-functions $\varphi(s, F)$ attached to holomorphic cusp forms, defined by (4.3). The series
$\varphi(s, F)$ converges absolutely for $\sigma > (\kappa + 1)/2$, and can be continued to an entire
function. The critical strip is $(\kappa - 1)/2 \leq \sigma \leq (\kappa + 1)/2$. In this section we survey the
results on the mean square of $\varphi(s, F)$. Some of the quoted papers actually study more
general cases (e.g. congruence subgroups), but here we restrict ourselves to the case of the
full modular group for simplicity. Also we assume that $F(z)$ is a normalized eigenform
(i.e. a simultaneous eigenfunction of Hecke operators with $a(1) = 1$).

The connection between $F(z)$ and $\varphi(s, F)$ was established by Hecke in 1936-37. Just a
few years later, the mean square of $\varphi(s, F)$ was already studied by Potter [163I, II]. Let

$$I_\sigma(T, F) = \int_{0}^{T} |\varphi(\sigma + it, F)|^2\,dt.$$

As a consequence of his general theorem, Potter [163I] proved

$$I_\sigma(T, F) \sim \left(\sum_{n=1}^{\infty} a(n)^2 n^{-2\sigma}\right) T \qquad \left(\sigma > \frac{\kappa}{2}\right) \tag{9.1}$$

and then he [163II] proved that $I_{\kappa/2}(T, F) = O(T \log T)$.

In the middle of the 1970s, Good began deeper investigations of $I_\sigma(T, F)$. First, Good [32]
applied Titchmarsh's idea of using the Tauberian principle (8.8)-(8.9) to the present case,
and obtained the asymptotic formula

$$I_{\kappa/2}(T, F) \sim 2\kappa A_0\, T \log T, \tag{9.2}$$

where

$$A_0 = \frac{12 (4\pi)^{\kappa - 1}}{\Gamma(\kappa + 1)} \iint_{\mathcal{F}} |F(x + iy)|^2 y^{\kappa - 2}\,dx\,dy,$$

the integral being taken over a fundamental domain $\mathcal{F}$ of $SL(2, \mathbb{Z})$. The constant $A_0$ appears
in Rankin's celebrated formula

$$\sum_{n \leq x} a(n)^2 = A_0 x^{\kappa} + O\left(x^{\kappa - 2/5}\right),$$

which is essentially used in Good's proof of (9.2).

The next paper [33] of Good gives a certain approximate functional equation for $\varphi(s, F)$.
Let $\omega : [0, \infty) \to \mathbb{R}$ be a (fixed) $C^\infty$-function such that $\omega(\rho) = 1$ if $0 \leq \rho \leq 1/2$ and
$\omega(\rho) = 0$ if $\rho \geq 2$. Define $\omega_0(\rho) = 1 - \omega(1/\rho)$. Then, a useful form of Good's formula
can be stated as


J 
g(s, F) = Ene [t| )Yoaoe ‘w(=)(- =) 


y 


K ee LASS) _ 
+ (-1)3 (22)? a one —s, [t(74) 


S—K wy! lt, ! 
«Zao (S\(-§) 


n=l 


K+ 


+ O(\jo"T? yy, 7 


—O 


oar! —a 
It1-3) + Ola! t? Iy? y2 17 2) (9.3) 


for $l > (\kappa + 1)/2$ and $4\pi^2 y_1 y_2 = t^2$, where $\|\cdot\|_1$ means the $L^1$-norm and $\gamma_j(s, |t|^{-1})$ is
a quantity defined by a certain integral. Note that $\gamma_0(s, |t|^{-1}) = 1$. It is possible to deduce
the approximate functional equation of classical type (like (1.4)) from (9.3), but the above
form is more effective in applications. Using (9.3), Good [33] proved


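$$I_\sigma(T, F) = \begin{cases} 2\kappa A_0\, T \log T + O(T) & \left(\sigma = \frac{\kappa}{2}\right), \\ \left(\sum_{n=1}^{\infty} a(n)^2 n^{-2\sigma}\right) T + O\left(T^{\kappa + 1 - 2\sigma}\right) & \left(\frac{\kappa}{2} < \sigma < \frac{\kappa + 1}{2}\right), \\ \left(\sum_{n=1}^{\infty} a(n)^2 n^{-2\sigma}\right) T + O\left(\log^2 T\right) & \left(\sigma = \frac{\kappa + 1}{2}\right). \end{cases} \tag{9.4}$$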


The formula (9.3) and its relatives are fundamental in Good's theory. In [34], Good
proved a formula of the same type for $\zeta(s)$, from which he [34] [35] deduced several
new facts on $E(T)$ mentioned in Sections 1 and 7. A discrete mean square of $\varphi(s, F)$ was
studied by Good [36], as an application of (9.3). As for $I_\sigma(T, F)$, the next step of Good's
research is [37], in which he proved

$$I_{\kappa/2}(T, F) = 2\kappa A_0\, T \log T + A_1 T + E(T, F) \tag{9.5}$$

where $A_1$ is a constant, with
E(T, F) = O(T#(logT)*). (9.6) 
Moreover he showed that a certain non-vanishing assumption would lead to 


E(T, F) = Q(T?). (9.7) 
To prove these results, Good applied (9.3) to the integral

$$J_F(T, U) = \int_{-\infty}^{\infty} \left|\varphi\left(\frac{\kappa}{2} + it, F\right)\right|^2 \psi_U\left(\frac{t}{T}\right) dt$$

with a certain weight function $\psi_U$, and obtained an explicit formula for $J_F(T, U)$, in which
a sum involving the factor $a(l)a(l + n)(l + n/2)^{-s}$ appears. To analyze this sum, Good
studied the behaviour of the Dirichlet series

$$\sum_{l=1}^{\infty} a(l)\,a(l + n) \left(l + \frac{n}{2}\right)^{-s} \tag{9.8}$$

by using the spectral theory. He found an explicit expression of $J_F(T, U)$ written in terms
of non-analytic Poincaré series and Fourier coefficients of Maass wave forms, from which
he derived the above results (9.5)-(9.7).

In [38], Good developed his theory further, and improved (9.6) to

$$E(T, F) = O\left(T^{2/3} (\log T)^{C}\right) \tag{9.9}$$

with $C = 2/3$. The bound

$$\varphi\left(\frac{\kappa}{2} + it, F\right) = O\left(|t|^{1/3} (\log |t|)^{5/6}\right) \qquad (|t| \geq 2) \tag{9.10}$$

is an immediate consequence of (9.9).

Recently, Kamiya [96] generalized (9.3) to the case of

$$\varphi(s, F, \chi) = \sum_{n=1}^{\infty} \frac{a(n)\chi(n)}{n^{s}} \tag{9.11}$$

with a Dirichlet character $\chi$ mod $q$, and used it to obtain

$$\sideset{}{^*}\sum_{\chi \bmod q} \int_{-T}^{T} |\varphi(\sigma + it, F, \chi)|^2\,dt \ll \phi(q)\, T \log(qT) \tag{9.12}$$

if $q \ll T$, uniformly for $\kappa/2 - 1/\log(qT) \leq \sigma \leq \kappa/2 + 1/\log(qT)$ and $q$. Here, $\sum^*$
denotes the summation running over primitive characters and $\phi(q)$ is Euler's function. This is
the result corresponding to Montgomery's estimate [144] for the fourth power moment of $L(s, \chi)$.

Another type of mean square of $\varphi(s, F, \chi)$ (including the non-holomorphic case) was
discussed in Stefanicki [185] by a different method. See also Kamiya [97].

Jutila's consistent principle is to develop a theory which may treat both cases $\zeta(s)$
and $\varphi(s, F)$ simultaneously. In Chapter 4 of his lecture note [87], Jutila gave a proof of
$\varphi(\kappa/2 + it, F) = O(|t|^{1/3 + \varepsilon})$, slightly weaker than Good's (9.10), in such a unified way.
The estimate

$$\int_{0}^{T} \left|\varphi\left(\frac{\kappa}{2} + it, F\right)\right|^6 dt = O\left(T^{2 + \varepsilon}\right), \tag{9.13}$$

an analogue of (3.5), was also proved in the same chapter. Jutila proved those results by
means of his transformation method, hence the arguments are of elementary nature. At the
end of [87] Jutila proposed the problem of showing (7.5) and the corresponding estimate

$$\int_{T}^{T + T^{2/3}} \left|\varphi\left(\frac{\kappa}{2} + it, F\right)\right|^2 dt = O\left(T^{2/3 + \varepsilon}\right), \tag{9.14}$$

an obvious corollary of (9.9), in a unified way. This problem was solved by Jutila himself
in [88I], mentioned in Section 7. Hence [88I] includes a proof of (9.14) by classical means.
An alternative proof of (9.13) is given in [88II].

We sometimes mentioned the recent spectral-theoretic approach to the fourth power
moment of $\zeta(s)$. In view of the analogy between $\zeta^2(s)$ and $\varphi(s, F)$, one may expect,
as suggested by (9.9), that (1.20) could be improved to

$$E_2(T) = O\left(T^{2/3 + \varepsilon}\right). \tag{9.15}$$

This was first achieved by Zavorotnyi [201] by using Kuznetsov's convolution formula
[124]. Ivić-Motohashi [77] gave an alternative proof, with $T^{\varepsilon}$ replaced by a log-power. In
the latter proof, Motohashi's explicit formula [151] for the weighted fourth power mean of
$|\zeta(1/2 + it)|$ is essentially used. It is worth noting that the basic idea of [151] is an
extension of Atkinson's dissection argument to the fourth power situation.

We already discussed the close connection between $E(T)$ and the Dirichlet divisor problem.
Similarly, $E_2(T)$ is closely related to the additive divisor problem, as was first
noticed by Atkinson [2]. The additive divisor problem is the problem of evaluating the
sum $\sum_{n \leq x} d(n)d(n + r)$, and has a long and rich history. The associated zeta-function is

$$\sum_{n=1}^{\infty} d(n)\,d(n + r)\,n^{-s}, \tag{9.16}$$

whose explicit spectral-theoretic expression was obtained by A.I. Vinogradov-Takhtadzhyan
[194]. One may notice the similarity between (9.8) and (9.16), both of which were handled
by spectral theory. Inspired by those works of Good and Vinogradov-Takhtadzhyan, and
also inspired by the classical works of Titchmarsh [187] and Atkinson [2] on the Laplace
transform of $|\zeta(1/2 + it)|^4$, Jutila [91] developed a new unified approach to $E_2(T)$ and
$E(T, F)$. He obtained spectral-theoretic explicit formulas for both $E_2(T)$ and $E(T, F)$,
which, in the case of $E_2(T)$, has the same flavour as Motohashi's explicit formula [151].
As a consequence, Jutila reproved (9.9) (with the factor $T^{\varepsilon}$) and (9.15).

Another approach was given by Motohashi [152]. He sketched how to modify
the argument in [151] to obtain an explicit formula for the weighted mean square of $\varphi(s, F)$,
and to deduce from it the estimate (9.9) as well as the mean square estimate

$$\int_{0}^{T} E^2(t, F)\,dt = O\left(T^{2} (\log T)^{C}\right). \tag{9.17}$$

The latter is the analogue of

$$\int_{0}^{T} E_2^2(t)\,dt = O\left(T^{2} (\log T)^{C}\right) \tag{9.18}$$

due to Ivić-Motohashi [76]. Jutila [93] pursued his approach via Laplace transforms further,
and proved (9.17) and (9.18) in a unified way.

An important advantage of Jutila's method is that it may also treat the non-holomorphic
case. The results corresponding to (9.9) and (9.17) (with $(\log T)^{C}$ replaced by $T^{\varepsilon}$) for
$L$-functions attached to Maass wave forms are proved in [91] [93]. The former is an
improvement of Kuznetsov's result [123], which gives the exponent $6/7 + \varepsilon$ in the error
term. See also Müller [157] for another approach to the non-holomorphic case.


10 Dirichlet L-functions 


Now we return to the GL(1)-situation, and in this section we discuss various mean square
formulas for Dirichlet $L$-functions. Let $\chi$ be a Dirichlet character mod $q$, and $L(s, \chi)$ the
corresponding Dirichlet $L$-function. A natural extension of $I_\sigma(T)$ is the mean value

$$I_\sigma(T, q) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \int_{0}^{T} |L(\sigma + it, \chi)|^2\,dt.$$

Serious research on this quantity was started by Ramachandra and his school in the 1970s. It
was mentioned by Ramachandra [167] that he had obtained the asymptotic formula


1, (T.q) = PDT logiqT) + OT (ogg) 


It is easy to see that

$$L(s, \chi) = q^{-s} \sum_{a=1}^{q} \chi(a)\, \zeta\left(s, \frac{a}{q}\right), \tag{10.1}$$

where $\zeta(s, \alpha)$ is the Hurwitz zeta-function defined by the analytic continuation of the
Dirichlet series $\sum_{n=0}^{\infty} (n + \alpha)^{-s}$. Hence we can reduce the problem of evaluating $I_\sigma(T, q)$
to the study of the mean square of $\zeta(s, \alpha)$. The approximate functional equation of $\zeta(s, \alpha)$,
corresponding to (1.4), can be stated as


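$$\zeta(s, \alpha) = \sum_{0 \leq n \leq \xi} (n + \alpha)^{-s} + \left(\frac{2\pi}{t}\right)^{\sigma - \frac{1}{2} + it} e^{i(t + \pi/4)} \sum_{1 \leq n \leq \eta} \frac{e^{-2\pi i n \alpha}}{n^{1 - s}} + O\left(\xi^{-\sigma} \log t\right), \tag{10.2}$$

valid for $1 \leq \xi \leq \eta$, $2\pi\xi\eta = t$, and $0 < \sigma < 1$. Rane [173] proved a more precise
approximate formula of the Riemann-Siegel type, and used it to prove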


T l . 2 
| ‘(5+ ie] 


where $C(\alpha)$ is a constant depending on $\alpha$. From (10.1) and (10.3) Rane [173] proved


dt =TlosT +C(a)T ——+ O(a72T?2logT), (10.3) 
O4 


$$I_{1/2}(T, q) = \frac{\phi(q)}{q}\, T \left\{\log \frac{qT}{2\pi} + 2\gamma - 1 + \sum_{p \mid q} \frac{\log p}{p - 1}\right\} + E(T, q) \tag{10.4}$$

with


E(T,q) <« ae cr log T + logq). (10.5) 


Balasubramanian-Ramachandra [8] gave a simpler proof of (10.4) and (10.5). A key lemma
in their argument is a short interval mean square estimate of $\zeta(1/2 + it, \alpha)$, which is proved
by the idea of Ramachandra [166].
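
The reduction (10.1) of Dirichlet $L$-functions to Hurwitz zeta-functions is an exact rearrangement of the series according to residue classes, so it can be checked on partial sums. The sketch below is illustrative only: the helper names are hypothetical, the character is the non-principal character mod 4, and the series are truncated (hence $\operatorname{Re} s > 1$ is assumed so that the partial sums approximate the functions).

```python
def chi4(n):
    """Non-principal Dirichlet character mod 4."""
    r = n % 4
    if r == 1:
        return 1
    if r == 3:
        return -1
    return 0


def l_partial(s, chi, q, blocks):
    """Partial sum of L(s, chi) over 1 <= n <= q * blocks."""
    return sum(chi(n) / n ** s for n in range(1, q * blocks + 1))


def hurwitz_partial(s, alpha, terms):
    """Partial sum of the Hurwitz series sum_{n >= 0} (n + alpha)^(-s)."""
    return sum((n + alpha) ** (-s) for n in range(terms))


def l_via_hurwitz(s, chi, q, blocks):
    """Right-hand side of (10.1) with each Hurwitz series truncated.

    Writing n = q*k + a with 1 <= a <= q and 0 <= k < blocks reproduces
    exactly the terms of l_partial, only in a different order.
    """
    return q ** (-s) * sum(
        chi(a) * hurwitz_partial(s, a / q, blocks) for a in range(1, q + 1)
    )


if __name__ == "__main__":
    print(l_partial(2, chi4, 4, 5000), l_via_hurwitz(2, chi4, 4, 5000))
```

For $s = 2$ both sides approximate Catalan's constant $L(2, \chi_4)$; the two partial sums agree up to rounding because they contain the same terms.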

The next step was due to Narlikar [160], who improved (10.5) to 


E(T,q) « ors "(log T)’. 


This is based on her refinement [159] of (10.3). Unfortunately Narlikar’s argument includes 
an error which leads to the existence of extra terms of the order T!/* in her statements, 
which are to be deleted. Zhan [202] mentioned that it is possible to correct this error 
and justify Narlikar's argument. Zhan [202] himself adopted a different way; he proved
an approximate formula of the type of Heath-Brown [53] for $\zeta(1/2 + it, \alpha)$, and using it
he gave further improvements on the results mentioned above. In particular he obtained
E(T,q) = O(T°%**) with a certain a < 1/3. Other variants of approximate functional 
equation for $\zeta(s, \alpha)$ were recently given by Rane [177] [178]. See also
Balasubramanian-Ramachandra [10].

Meurman [141] proved a generalization of Atkinson’s formula to E(T,q), and from 
which he deduced 


o(qg)(gT)3*® +q't*)  (q &T) 


= oF - (10.6) 
b(q) (qT)2"* +qT ) (¢ >T). 


E(T,q)<« | 


Laurinčikas [130] proved an analogue of Meurman's formula near the critical line, while
the analogue for fixed $\sigma$, $1/2 < \sigma < 1$, was obtained by Nakaya [158]. Meurman's paper
[141] includes the short interval estimate


l 
L{ —+it, 
(p+) 


for $H > 1$, as a corollary. An alternative proof of (10.7) was obtained by
Balasubramanian-Ramachandra [11], in which further improvements by using the theory of exponent pairs
were also discussed. It is to be noted that their argument includes, as a special case, a simple


proof of the estimate 
T+T'/3 1 
= it 
i bG+) 


The study of the mean square of individual L-functions 


°T+H 


J 


xX modgq 


2 
dt <(qH + (qT)3)\(q(T + HD)’. (10.7) 


2 


T 
| |L(o + it, x)|?dt 
O 
is more difficult. See the recent article [108] of Katsurada and the author, in which the
history of this problem is sketched.

Another version of mean square of $L$-functions is

$$U(s, q) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} |L(s, \chi)|^2.$$

In the case of $s = 1/2 + it$, $t \geq 2$, Gallagher [29] proved

$$U\left(\frac{1}{2} + it, q\right) \ll \frac{q + t}{\phi(q)} \log(qt). \tag{10.8}$$

Improved upper bounds were obtained by Meurman [141] and Rane [176]. Balasubramanian
[6] gave an asymptotic formula with the main term $q^{-1}\phi(q) \log(qt)$, which was further
refined by W. Zhang [205] [214] and Yu [200]. In [214], it is shown that


1. \  $@) qt log p 
u(5 +i] = ; {ioe (2) +27 + ee 
P\q 
+O (sae + qty} exp (ee )) . (10.9) 


(q) (q) log log(qt) 


In most of the above works, the problem is reduced by (10.1) to the mean square of Hurwitz 
zeta-functions. And for the latter problem, for example in [214], it is shown that 


q 


2 


a=] 


I a ? qt | 
a Gor ot Le =q}{log{— ])+2y}+ O(qt + (qt)2 logt) (10.10) 
2 q 20 


(a special case of Lemma 7 of [214]). W. Zhang’s papers [206] [215] are devoted to the 
study of Dae, |L(1/2 + it, x)|?, while in [204] he obtained an asymptotic formula for 


q lL , 
ito. TAz& 
ENE (; oe x) 


xX modq 
On the other hand, Motohashi [149I] applied Atkinson's dissection argument (see
Section 3) to the study of $U(s, q)$, and obtained an asymptotic formula for $U(1/2 + it, p)$ for
any prime $p$ and fixed $t$. Motohashi's idea was further developed by Katsurada-Matsumoto
[103], who proved the following formula:


ht $(q) q log p cay as re 
Ul aig) a4 log Re—(— +i 
(543 a) ; {tor + ae Decca bas e=(5 +! 


Plq 


ps q Dy tebe 
+2 u(2)r(5 + uk), (10.11) 


k\q 


2 


q<Q 
where $\mu(\cdot)$ denotes the Möbius function and $T(1/2 + it; k)$ satisfies the asymptotic formula

1 ofa on ara) ae 1 I 
be a 2 +it—n i Po peeone 
(5 +ink] = ke | Et hn jt (5 +i n)e(; +0)| 
Ok hr) (10.12) 


for any positive integer $N$, where the $O$-constant depends only on $N$. The quantity
$T(1/2 + it; 1)$ can be written down in a closed form, while (10.12) gives the asymptotic expansion
of $T(1/2 + it; k)$ with respect to $k$ if $k > 1$. Hence, if $q = p^m$ is a prime power, then from
(10.11) we can deduce the asymptotic expansion of $U(1/2 + it, p^m)$ with respect to $p$. The
special case $t = 0$ of (10.11) was first obtained by Heath-Brown [54] by a different method,
but the coefficients of the expansion are not explicitly given there. J. Zhang-Xing [203]
gave an alternative proof of (10.11) by using Hurwitz zeta-functions. Another
proof of (10.11) was recently obtained by Katsurada [101]. Rane [175] also gave a similar
expansion, but the coefficients are not explicit.

The mean square of $(d/ds)L(s, \chi)$ was considered by W. Zhang [211] [212] and Chen
[17]. Katsurada [99III] generalized the method of [103] [99II] to study the case of
$(d^k/ds^k)L(s, \chi)$ for any positive integer $k$.

In [103], the asymptotic expansion formula was proved not only for $\sigma = 1/2$, but for any
$\sigma$ satisfying $0 < \sigma < N + 1$, as was pointed out in Katsurada-Matsumoto [106]. The region
was further extended by Katsurada [99II] [101]. The formula (10.11) can be derived from
this general formula as the limit case $\sigma \to 1/2$. Another important limit case is $\sigma = 1$, and
in this way we can deduce a precise formula for


$$V(q) = \sum_{\substack{\chi \bmod q \\ \chi \neq \chi_0}} |L(1, \chi)|^2,$$

where $\chi_0$ is the principal character mod $q$. Evaluation of $V(q)$ is a classical problem, and
in the case that $q = p$ is a prime, the formula

$$V(p) = \zeta(2)\,p + O\left(\log^2 p\right)$$

goes back to Paley [162] and Selberg [181]. This was refined by Slavutskii [183] [184] and
then W. Zhang [209] [210]; the result given in [210] is that

$$V(p) = \zeta(2)\,p - \log^2 p + C + O\left(\frac{1}{\log p}\right).$$


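Following the above-mentioned method, Katsurada-Matsumoto [106] proved the asymptotic
expansion

$$V(p) = \zeta(2)\,p - \log^2 p + \left(\gamma_0^2 - 2\gamma_1 - 3\zeta(2)\right) - \left(\gamma_0^2 - 2\gamma_1 - 2\zeta(2)\right) \frac{1}{p} + 2\left(1 - \frac{1}{p}\right) \left\{\sum_{n=1}^{N-1} (-1)^n \zeta(1 - n)\,\zeta(1 + n)\,p^{-n} + O\left(p^{-N}\right)\right\} \tag{10.13}$$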
for any positive integer $N$, where $\gamma_j$ is defined by the following Laurent expansion of $\zeta(s)$
at $s = 1$:

$$\zeta(s) = \frac{1}{s - 1} + \sum_{j=0}^{\infty} \gamma_j (s - 1)^j. \tag{10.14}$$

The formula (10.13) gives a satisfactory answer to the problem of evaluating $V(p)$.
Moreover, in [106], a generalization of (10.13) to any composite $q$ is proved. Some explicit
expressions of $U(m, q)$, where $m$ ($\neq 1$) is an integer, are also obtained in [106].


11 Hurwitz Zeta and Other Related Zeta-functions 


We already mentioned some mean value results on $\zeta(s, \alpha)$ in the preceding section (see
(10.3) and (10.10)). In this final section we mainly discuss the approach by Atkinson's
method, due to the recent papers of Katsurada and the author.

First, it is easily seen that a simple modification of the method developed in
Katsurada-Matsumoto [103] can be applied to the discrete mean square $\sum_{a=1}^{q} |\zeta(s, a/q)|^2$. The result
is an asymptotic expansion formula similar to (10.11) and (10.12), proved in
Katsurada-Matsumoto [104]. This should be compared with (10.10).

A more interesting problem is to evaluate the integral

$$H(s) = \int_{0}^{1} |\zeta_1(s, \alpha)|^2\,d\alpha,$$

where $\zeta_1(s, \alpha) = \zeta(s, \alpha) - \alpha^{-s}$. In the case of $s = 1/2 + it$, $t \geq 2$, this problem was first
considered by Koksma-Lekkerkerker [119], who showed $H(1/2 + it) = O(\log t)$. This
result was used in Gallagher's proof of (10.8). Balasubramanian [5] proved the asymptotic
formula

$$H\left(\frac{1}{2} + it\right) = \log t + O(\log \log t),$$

and further refinements were done by Rane [174], Sitaramachandrarao (unpublished), and
W. Zhang [207] [213], by using the approximate functional equation (10.2). Zhang [213]
arrived at the result


l t 
H( —+it) =log(—)+y 4 0 ®(logr)i), 
2 2m 
and conjectured that the error estimate could be improved to O(t7!/*). Ramachandra 


independently expressed the same opinion. This conjecture was settled by Andersson [1]
and Zhang himself [217], independently of each other, in the following unexpected form:

$$H\left(\frac{1}{2} + it\right) = \log \frac{t}{2\pi} + \gamma - 2\operatorname{Re} \frac{\zeta\left(\frac{1}{2} + it\right)}{\frac{1}{2} + it} + O\left(t^{-1}\right). \tag{11.1}$$

Shortly after their works, Katsurada-Matsumoto [105] [107I] obtained the following
asymptotic expansion. For any integer $K > 0$, it holds that


1. rfl. (4 +it)-1 
H(—+it) = Re—(=+ir)+y —log2x — 2Re—+~———_ 
2 r 2 a 
(—1)k¥-l(k — 1)! —~ ; 
= 2Re Sa a yok t+ jy 3tk-it 
1 (5 —k+ir)(3 —k+it)...4g +) 77 
ae ey, (11.2) 


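The starting point of the proof is Atkinson's dissection device (cf. Section 3). For $\operatorname{Re} u > 1$,
$\operatorname{Re} v > 1$, we have

$$\zeta(u, \alpha)\,\zeta(v, \alpha) = \zeta(u + v, \alpha) + f(u, v; \alpha) + f(v, u; \alpha),$$

where

$$f(u, v; \alpha) = \sum_{m=0}^{\infty} (m + \alpha)^{-u} \sum_{n=1}^{\infty} (m + n + \alpha)^{-v}. \tag{11.3}$$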
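
Atkinson's dissection is, at bottom, a splitting of a double sum into its diagonal part and the two parts above and below the diagonal, so a truncated version of it holds exactly. The following sketch checks this on finite sums. It assumes the convention $\zeta(s, \alpha) = \sum_{n \geq 0} (n + \alpha)^{-s}$ and takes $f(u, v; \alpha) = \sum_{m \geq 0} (m + \alpha)^{-u} \sum_{n \geq 1} (m + n + \alpha)^{-v}$; the function names and the truncation parameter `M` are hypothetical.

```python
def hurwitz_partial(s, alpha, M):
    """sum_{0 <= n < M} (n + alpha)^(-s)."""
    return sum((n + alpha) ** (-s) for n in range(M))


def f_partial(u, v, alpha, M):
    """Off-diagonal part: pairs (m, m + n) with n >= 1 and m + n < M."""
    total = 0
    for m in range(M):
        inner = sum((m + n + alpha) ** (-v) for n in range(1, M - m))
        total += (m + alpha) ** (-u) * inner
    return total


def dissection_gap(u, v, alpha, M):
    """Difference between the product of partial sums and its dissection."""
    product = hurwitz_partial(u, alpha, M) * hurwitz_partial(v, alpha, M)
    diagonal = hurwitz_partial(u + v, alpha, M)
    return product - (
        diagonal + f_partial(u, v, alpha, M) + f_partial(v, u, alpha, M)
    )


if __name__ == "__main__":
    print(abs(dissection_gap(2 + 1j, 2 - 1j, 0.3, 200)))
```

The printed gap is only rounding error; the analytic work in the theory lies in continuing $f(u, v; \alpha)$ beyond the region of absolute convergence, which this finite check does not touch.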


By an argument similar to [149I] [103], we can prove a contour-integral expression of
$f(u, v; \alpha)$, which gives the analytic continuation. Analyzing this expression further,
Katsurada-Matsumoto [107I] obtained the following formula, which is fundamental in their
theory. Let $N$ be a positive integer, $(s)_n = \Gamma(s + n)/\Gamma(s)$ the Pochhammer symbol, and
define


Sv (u,v) = >, a 6 +n)—1) 
and 
Ty (u,v) = 20M, 3 Sty [ea + B) “dB. 
(=v & 


Then it holds that 


1 
| f1(u,a)oi(v, a)da = 


utu—! 
BD i (7 oe Tee bY re p(T +) 
(wu) P(v) 
— Sn(u,v) — Sn(v, u) — Ty u, v) — Ty (0, u) (11.4) 


for $-N + 1 < \operatorname{Re} u < N + 1$, $-N + 1 < \operatorname{Re} v < N + 1$ and $(u, v) \notin E$, where $E$ is
the set of $(u, v)$ at which some factor in (11.4) has a singularity. We can derive a formula
for $(u, v) \in E$ as a limit case. For instance, (11.2) follows easily from the case $N = 1$ of
(11.4), by integrating $T_1(u, v)$ and $T_1(v, u)$ by parts $K$ times and taking $u \to 1/2 + it$ and
$v \to 1/2 - it$. Explicit expressions of $H(1 + it)$, and of $H(m)$ for any integer $m$ ($\neq 1$),
can also be deduced from (11.4) (see [107I] and [107II], respectively). Letting $N \to \infty$,
and then $u \to 1/2 + it$ and $v \to 1/2 - it$ in (11.4), we obtain


l i /l +n- it 
H(5+ir) =ReW-(; tit) +y- log 27 — 2Re ars (11.5) 
2 Tr \2 a S+n-+it 


due originally to Andersson [1]. (The special case $t = 0$ is included also in W. Zhang [217].)
Katsurada [100] presented an alternative proof of (11.4). His proof uses the
Mellin-Barnes type of integrals and properties of hypergeometric functions, hence it is under the
same principle as his [101]. Actually Katsurada's paper [100] treats a more general situation,
that is, the mean square of Lerch zeta-functions defined by the analytic continuation of
$\sum_{n=0}^{\infty} e^{2\pi i \lambda n} (n + \alpha)^{-s}$, where $\alpha > 0$ and $\lambda$ is real. He obtained various asymptotic
expansions, which include a refinement of W. Zhang's former result [216] [218].
Next we consider the derivative case


1 
Ay(s) -| 
0 


The case k = | was studied by W. Zhang [208] and Guo [40] [41], and it is shown that 


Ail sen = ie = ate (— 
~+it) = log’ { — og” ( — 
\5 ee OF eee Or 


+ 27; toe (5 ) + 272 + Ot! log? t) (11.6) 


2 
da. 


dk 
Fe, 


in [40] [41]. On the other hand, as was first noticed by Katsurada [99III], the method based
on Atkinson's dissection device is suitable for studying the mean square of higher derivatives.
The idea is, roughly speaking, to differentiate (11.4) $k$ times with respect to both $u$ and $v$
and analyze the resulting expression carefully. The result is that


2k 
I 1 t (2k)! _.(t 
A,{-+it) = log2kt! f = Ale one Gey gs cae fg (eee 
(5+) 7 ee ade On +L aK” On 
kc (ds it 
— 2Re aE Sinlisy 
(5 + it)kt! 


+ O(t~*(logt)**) (11.7) 


for any $k \geq 1$, where $\gamma_j$ is defined by (10.14). The case $k = 1$ gives a refinement of
(11.6). The formula (11.7) was announced in [107II], and the detailed proof is described in
[107III].

Finally we mention the author's work [139] (already announced in [138]) on the double
zeta-function

$$\zeta_2(s; \alpha, w) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} (\alpha + m + nw)^{-s} \tag{11.8}$$

of Barnes, where $\alpha > 0$, $w > 0$ are parameters. Inspired by the similarity of (11.3) and
(11.8), the author introduced the generalized double zeta-function

$$\zeta_2(u, v; \alpha, w) = \sum_{m=0}^{\infty} (\alpha + m)^{-u} \sum_{n=1}^{\infty} (\alpha + m + nw)^{-v},$$

and applied a method similar to that in [107I] to $\zeta_2(u, v; \alpha, w)$. Then we put $u = 0$, $v = s$
to obtain the asymptotic expansion of $\zeta_2(s; \alpha, w)$ with respect to $w$. Certain asymptotic
expansions for double gamma-functions, and for the value at $s = 1$ of Hecke $L$-functions
of real quadratic fields, were also obtained in [139]. Recently, Katsurada [102] introduced
another generalization

$$\sum_{m=0}^{\infty} (m + \alpha)^{-u} \sum_{n=0}^{\infty} (m + n + \alpha + \beta)^{-v},$$

where $\alpha$, $\beta$ are positive, and studied its properties by using the Mellin-Barnes type of
integrals. He actually considered a more general series involving exponential factors.

The results mentioned in the last two sections show that Atkinson's method is indeed
useful in a much wider area than was expected before.

Readers will probably find that the recent developments in the mean square theory are
really impressive. However, the mean square theory has been by no means exhausted; there
remain many unsolved problems and uncultivated areas. It will still be one of the main
streams in zeta-function theory, and fascinating new methods and results will surely appear
in the coming century.
Algebraic Curves Over Finite Fields with many
Rational Points and their Applications


Harald Niederreiter and Chaoping Xing 


Algebraic curves over finite fields with many rational points have received a lot of attention in recent years. 
We present a survey of this subject covering both the case of fixed genus and the asymptotic theory. A strong 
impetus in the asymptotic theory has come from a thorough exploitation of the method of infinite class field 
towers. On the other hand, we show by a counterexample that Perret’s conjecture on infinite class field towers 
is wrong, and so Perret’s method of infinite ramified class field towers breaks down. In the last two sections 
of the paper we discuss applications of algebraic curves over finite fields with many rational points to coding 
theory and to the construction of low-discrepancy sequences. 


Key Words. Algebraic curves over finite fields, rational points, global function fields, 
rational places, algebraic coding theory, low-discrepancy sequences. 


1 Introduction 


Let $q$ be an arbitrary prime power and let $\mathbb{F}_q$ denote the finite field of order $q$. By an algebraic
curve over $\mathbb{F}_q$ we always mean a smooth, projective, absolutely irreducible algebraic curve
defined over $\mathbb{F}_q$. If $C$ is such a curve, then we write $g(C)$ for the genus of $C$. A point of $C$
is called $\mathbb{F}_q$-rational if it has homogeneous coordinates which all belong to $\mathbb{F}_q$. Let $N(C)$
denote the number of $\mathbb{F}_q$-rational points of $C$. The following definition is basic for this paper.


Definition 1 For any prime power $q$ and any integer $g \geq 0$ put
$$N_q(g) = \max N(C),$$
where the maximum is extended over all algebraic curves $C$ over $\mathbb{F}_q$ with $g(C) = g$.


The calculation of $N_q(g)$ is a very difficult problem in algebraic geometry and exact
values are currently known only in some isolated cases (see Section 2). Mostly, we have to
be satisfied with lower and upper bounds for $N_q(g)$. A well-known general upper bound
for $N_q(g)$ is the Weil-Serre bound

$$N_q(g) \leq q + 1 + g \lfloor 2q^{1/2} \rfloor, \tag{1}$$

where $\lfloor u \rfloor$ denotes the greatest integer not exceeding the real number $u$. Further information
on upper bounds is provided in Section 2. In an informal way, we say that an algebraic
curve $C$ over $\mathbb{F}_q$ of genus $g$ has many rational points if $N(C)$ is reasonably close to $N_q(g)$
or to a known upper bound for $N_q(g)$.
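
The Weil-Serre bound (1) involves only integer arithmetic, since $\lfloor 2q^{1/2} \rfloor$ is the integer square root of $4q$. A minimal sketch, with a hypothetical function name:

```python
import math


def weil_serre_bound(q, g):
    """Upper bound q + 1 + g * floor(2 * sqrt(q)) for N_q(g).

    math.isqrt(4 * q) equals floor(2 * sqrt(q)) and avoids floating point.
    The caller is assumed to pass a prime power q and a genus g >= 0.
    """
    if q < 2 or g < 0:
        raise ValueError("expected a prime power q >= 2 and a genus g >= 0")
    return q + 1 + g * math.isqrt(4 * q)


if __name__ == "__main__":
    for q, g in [(2, 1), (4, 1), (2, 50)]:
        print(q, g, weil_serre_bound(q, g))
```

For genus 0 the bound reduces to $q + 1$, the number of rational points of the projective line.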
It will often be convenient to replace the language of algebraic curves over finite fields by
the equivalent language of global function fields, i.e., of algebraic function fields with finite
constant fields. For an algebraic curve $C$ over $\mathbb{F}_q$, the field $K$ of $\mathbb{F}_q$-rational functions on $C$
is a global function field with full constant field $\mathbb{F}_q$, that is, with $\mathbb{F}_q$ algebraically closed in
$K$. We use the notation $K/\mathbb{F}_q$ if we want to emphasize the fact that $\mathbb{F}_q$ is the full constant
field of $K$. With every global function field $K/\mathbb{F}_q$ we can also associate an algebraic curve
$C$ over $\mathbb{F}_q$. By a rational place of $K$ we mean a place of $K$ of degree 1. We write $g(K)$ for
the genus of $K$ and $N(K)$ for the number of rational places of $K$. In the correspondence
between the curve $C$ and its function field $K$, the closed points of $C$ can be identified with
the places of $K$ and the $\mathbb{F}_q$-rational points of $C$ can be identified with the rational places
of $K$, so that $N(C) = N(K)$. Furthermore, we have $g(C) = g(K)$. In particular, we can
interpret the quantity $N_q(g)$ in Definition 1 as the maximum number of rational places that
a global function field $K/\mathbb{F}_q$ of genus $g$ can have. In analogy with a corresponding way
of speaking for algebraic curves, we say that a global function field $K/\mathbb{F}_q$ of genus $g$ has
many rational places if $N(K)$ is reasonably close to $N_q(g)$ or to a known upper bound for
$N_q(g)$.

In Section 2 we review some known results on $N_q(g)$ and in Section 3 we discuss the
asymptotic behavior of $N_q(g)$ as $g \to \infty$. The conjecture of Perret [19] on infinite class field
towers, which arose in the asymptotic theory of $N_q(g)$, is disproved by a counterexample
in Section 4. As a consequence, Perret's method of infinite ramified class field towers
breaks down. A survey of constructions of algebraic curves over finite fields with many
rational points is given in Section 5. Applications of such curves to algebraic coding theory
and to the construction of low-discrepancy sequences are discussed in Sections 6 and 7,
respectively. Tables of lower and upper bounds for $N_q(g)$ and of an interesting quantity
from the theory of low-discrepancy sequences are included.


2 Results on $N_q(g)$


General formulas for $N_q(g)$ are known only for small values of the genus $g$. It is trivial that
$N_q(0) = q + 1$. A formula for $N_q(1)$ is essentially due to Deuring [1]; see also Waterhouse
[30]. We have $N_q(1) = q + 1 + \lfloor 2q^{1/2} \rfloor$, except in the case where $q = p^e$ with a prime
$p$ dividing $\lfloor 2q^{1/2} \rfloor$ and an odd integer $e \geq 3$, in which case $N_q(1) = q + \lfloor 2q^{1/2} \rfloor$. Serre
[20], [21], [22] determined the values of $N_q(2)$. Write again $q = p^e$ with a prime $p$ and
an integer $e \geq 1$. If $e$ is even and $q \neq 4, 9$, then $N_q(2) = q + 1 + 4q^{1/2}$, whereas
$N_4(2) = 10$ and $N_9(2) = 20$. We call $q$ special if either $p$ divides $\lfloor 2q^{1/2} \rfloor$ or $q$ is of the
form $m^2 + 1$, $m^2 + m + 1$, or $m^2 + m + 2$ for some integer $m$. If $e$ is odd and $q$ is not special,
then $N_q(2) = q + 1 + 2\lfloor 2q^{1/2} \rfloor$. If $e$ is odd and $q$ is special, then $N_q(2) = q + 2\lfloor 2q^{1/2} \rfloor$
or $q + 2\lfloor 2q^{1/2} \rfloor - 1$, depending on whether the fractional part $\{2q^{1/2}\}$ is greater than
$(\sqrt{5} - 1)/2$ or not.
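
The case distinctions for $N_q(1)$ and $N_q(2)$ described above translate directly into a short routine. The sketch below is only an illustration of those rules; the function names are hypothetical, the factorization helper is a naive trial division meant for small $q$, and the comparison of the fractional part with $(\sqrt{5} - 1)/2$ is done in floating point, which is adequate for small $q$ but is an assumption rather than an exact test.

```python
import math


def prime_power(q):
    """Return (p, e) with q == p**e; raise ValueError if q is not a prime power."""
    if q < 2:
        raise ValueError("q must be at least 2")
    for p in range(2, q + 1):
        if q % p == 0:
            e, rest = 0, q
            while rest % p == 0:
                rest //= p
                e += 1
            if rest != 1:
                raise ValueError("q is not a prime power")
            return p, e
    raise ValueError("q is not a prime power")


def n_q_genus1(q):
    """N_q(1) following the Deuring-Waterhouse rule quoted in the text."""
    p, e = prime_power(q)
    m = math.isqrt(4 * q)  # floor(2 * sqrt(q))
    if e % 2 == 1 and e >= 3 and m % p == 0:
        return q + m
    return q + 1 + m


def is_special(q):
    """Serre's notion of a special q (relevant when e is odd)."""
    p, _ = prime_power(q)
    m = math.isqrt(4 * q)
    if m % p == 0:
        return True
    return any(
        q in (k * k + 1, k * k + k + 1, k * k + k + 2)
        for k in range(math.isqrt(q) + 1)
    )


def n_q_genus2(q):
    """N_q(2) following Serre's rule quoted in the text."""
    _, e = prime_power(q)
    m = math.isqrt(4 * q)
    if e % 2 == 0:
        if q == 4:
            return 10
        if q == 9:
            return 20
        return q + 1 + 2 * m  # here 2 * m equals 4 * sqrt(q)
    if not is_special(q):
        return q + 1 + 2 * m
    fractional = 2 * math.sqrt(q) - m
    if fractional > (math.sqrt(5) - 1) / 2:
        return q + 2 * m
    return q + 2 * m - 1


if __name__ == "__main__":
    for q in (2, 3, 4, 5, 7, 8, 9, 16, 27):
        print(q, n_q_genus1(q), n_q_genus2(q))
```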

An algebraic curve C over F_q of genus g is called maximal if N(C) = q + 1 + 2gq^{1/2}.
Clearly, maximal curves can exist only if g = 0 or if q is a square. If a maximal curve over
F_q of genus g exists, then N_q(g) = q + 1 + 2gq^{1/2}, that is, we have equality in the Weil-Serre
bound (1). A well-known family of maximal curves is given by the Hermitian curves. If
q = r^2 is a square, then the Hermitian curve H over F_q is defined by x^{r+1} + y^{r+1} + z^{r+1} = 0.
This curve satisfies g(H) = r(r − 1)/2 and N(H) = r^3 + 1. Furthermore, for any maximal
curve C over F_q with q = r^2 we have g(C) ≤ r(r − 1)/2. For recent work on maximal
curves we refer to Fuhrmann and Torres [2] and Xing [31].
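For the smallest case r = 2 (so q = 4 and g(H) = 1) the point count of the Hermitian curve can be confirmed by brute force. The sketch below is an illustration only; the encoding of F_4 as 2-bit integers and the helper names are our own choices.

```python
from itertools import product

# GF(4) = F_2[a]/(a^2 + a + 1); elements are encoded as 0, 1, 2 (= a), 3 (= a + 1).

def gf4_mul(x, y):
    """Multiply two GF(4) elements given as 2-bit integers."""
    r = 0
    for i in range(2):
        if (y >> i) & 1:
            r ^= x << i          # carry-less (polynomial) multiplication over F_2
    if r & 0b100:
        r ^= 0b111               # reduce using a^2 = a + 1
    return r

def gf4_cube(x):
    return gf4_mul(x, gf4_mul(x, x))

def hermitian_points_gf4():
    """Count projective solutions of x^3 + y^3 + z^3 = 0 over GF(4) (case r = 2)."""
    affine = 0
    for x, y, z in product(range(4), repeat=3):
        if (x, y, z) == (0, 0, 0):
            continue
        # addition in characteristic 2 is XOR of the encodings
        if gf4_cube(x) ^ gf4_cube(y) ^ gf4_cube(z) == 0:
            affine += 1
    return affine // 3           # each projective point has q - 1 = 3 nonzero multiples

r = 2
q, g = r * r, r * (r - 1) // 2
print(hermitian_points_gf4(), r**3 + 1, q + 1 + 2 * g * r)   # prints: 9 9 9
```

The three printed numbers are the brute-force count, r^3 + 1, and the value q + 1 + 2gq^{1/2} required of a maximal curve.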

If the genus g is sufficiently large relative to q, then the Weil-Serre bound (1) can be
improved by using Weil's explicit formula for N(C) in terms of the eigenvalues of the
Frobenius of the algebraic curve C over F_q. A general refinement of (1) was established by
Serre [20]. Let

    f(θ) = 1 + 2 Σ_{n=1}^{m} c_n cos nθ

be a trigonometric polynomial with coefficients c_n ≥ 0 that are not all 0 and with f(θ) ≥ 0
for all real θ. Then

    N_q(g) ≤ g / b(q^{-1/2}) + b(q^{1/2}) / b(q^{-1/2}) + 1,        (2)

where b(t) = Σ_{n=1}^{m} c_n t^n. Another proof of this bound can be found in [23, Section V.3].
The quality of the bound (2) depends on the choice of the trigonometric polynomial f, and
the question of this choice was investigated by Oesterlé (see Serre [22]).
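To see how the choice of f affects the bound, one can evaluate the right-hand side g/b(q^{-1/2}) + b(q^{1/2})/b(q^{-1/2}) + 1 for simple trigonometric polynomials. The Python sketch below is an illustration under our own choices: the two coefficient lists are elementary examples (not Oesterlé's optimal polynomials), and nonnegativity of f is only checked numerically on a grid.

```python
from math import cos, pi, sqrt

def b(coeffs, t):
    """b(t) = sum_{n=1}^m c_n t^n for coeffs = [c_1, ..., c_m]."""
    return sum(c * t ** n for n, c in enumerate(coeffs, start=1))

def f(coeffs, theta):
    """f(theta) = 1 + 2 sum_{n=1}^m c_n cos(n theta)."""
    return 1 + 2 * sum(c * cos(n * theta) for n, c in enumerate(coeffs, start=1))

def looks_nonnegative(coeffs, samples=10_000, tol=1e-9):
    """Grid check of f >= 0 on [0, pi]; a numerical sanity check, not a proof."""
    return all(f(coeffs, pi * k / samples) >= -tol for k in range(samples + 1))

def serre_bound(q, g, coeffs):
    """Upper bound for N_q(g) obtained from the trigonometric polynomial f."""
    low = b(coeffs, q ** -0.5)
    high = b(coeffs, q ** 0.5)
    return g / low + high / low + 1

weil = [0.5]                      # f = 1 + cos(theta): recovers q + 1 + 2 g sqrt(q)
better = [sqrt(2) / 2, 0.25]      # f = (1 + sqrt(2) cos(theta))^2 / 2

for coeffs in (weil, better):
    assert looks_nonnegative(coeffs)
    print(serre_bound(2, 50, coeffs))
```

With c_1 = 1/2 the bound reduces to q + 1 + 2gq^{1/2}, and for q = 2 the second polynomial gives 1.6g + 3.4, which is already much smaller for large g.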

At the end of the paper we provide two tables of bounds for N_q(g). Table 1 is for
q = 2, 3, 4, 8, 9, 16, 27 and 1 ≤ g ≤ 50 and for q = 5 and 1 ≤ g ≤ 22. Table 2 extends
Table 1 for the important case q = 2 to the range 51 ≤ g ≤ 95. In each entry of the tables, the
first number is a lower bound for N_q(g) and the second is an upper bound for N_q(g). If only
one number is given, then this is the exact value of N_q(g). A program for calculating upper
bounds for N_q(g), which is based on (2) and the trigonometric polynomials of Oesterlé, was
kindly supplied to us by Jean-Pierre Serre. Tables 1 and 2 are reprinted from [18, Table 3]
and [15, Table 2], respectively.




3 Asymptotic Theory 


The asymptotic theory of algebraic curves over finite fields with many rational points is
concerned with the behavior of N_q(g) for a fixed prime power q and g → ∞. The basic
quantity here is given by the following definition.


Definition 2 For any prime power q put

    A(q) = lim sup_{g→∞} N_q(g)/g,

where g runs through positive values.


It follows from (1) that A(q) ≤ ⌊2q^{1/2}⌋. Ihara [4] obtained some improvements on this
bound and showed also that A(q) ≥ q^{1/2} − 1 if q is a square. Vlăduţ and Drinfeld [29]
proved the currently best general upper bound

    A(q) ≤ q^{1/2} − 1 for all q.        (3)

This result can be derived quite easily from (2), as demonstrated in [23, Theorem V.3.6].
As a consequence, we have A(q) = q^{1/2} − 1 if q is a square.

In the case where q is not a square, no exact values of A(q) are known, but lower bounds
are available which complement the general upper bound (3). Serre [20], [22] showed by
methods of class field theory that A(q) ≥ c log q with an absolute constant c > 0, and an
effective version and an alternative proof of this result were recently given by Niederreiter
and Xing [16]. Better lower bounds are known for special nonsquares q. For instance, Zink
[35] proved that

    A(p^3) ≥ 2(p^2 − 1)/(p + 2) for all primes p.

More generally, for composite nonsquares q the following bounds in Theorem 1 below
were recently established by Niederreiter and Xing [13], [15]. It is convenient to state these
results as bounds for A(q^m). For a real number u we write ⌈u⌉ for the least integer ≥ u.


Theorem 1 Let q be a prime power and m ≥ 3 an odd integer. Then

    A(q^m) ≥ (2q + 2)/(⌈2(2q + 3)^{1/2}⌉ + 1)    if q is odd,

    A(q^m) ≥ (q + 1)/(⌈2(2q + 2)^{1/2}⌉ + 2)     if q ≥ 4 is even.


If q = p^e with a prime p and an odd integer e ≥ 3, then Theorem 1 yields a lower bound
for A(q) of the order of magnitude q^{1/(2m)}, where m is the least prime factor of e. Further
refinements of the bounds in Theorem 1 can be found in Niederreiter and Xing [16]. The
following result of Niederreiter and Xing [13] yields the currently best lower bounds on
A(q) for q = 2, 3, 5.


Theorem 2 A(2) ≥ 81/317 = 0.2555..., A(3) ≥ 62/163 = 0.3803..., A(5) ≥ 2/3.
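To get a feeling for the gap that remains for q = 2, 3, 5, the lower bounds 81/317, 62/163 and 2/3 can be set against the general upper bound q^{1/2} − 1 from (3). The following Python lines are a small illustration; the dictionary and function names are our own.

```python
from fractions import Fraction
from math import sqrt

# Lower bounds on A(q) for q = 2, 3, 5 as stated in Theorem 2.
LOWER = {2: Fraction(81, 317), 3: Fraction(62, 163), 5: Fraction(2, 3)}

def upper_bound(q):
    """General upper bound (3): A(q) <= sqrt(q) - 1."""
    return sqrt(q) - 1

for q, low in LOWER.items():
    up = upper_bound(q)
    # ratio of lower to upper bound: how much of the admissible range is covered
    print(q, float(low), round(up, 4), round(float(low) / up, 3))
```

In each of the three cases the known lower bound covers only a little more than half of the range allowed by (3).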


4 A Counterexample to Perret’s Conjecture 


Perret [19] described a method of obtaining lower bounds for A(q) which is based on infinite 
ramified class field towers. However, this method depends on a conjecture which would 
provide a sufficient condition for the infinitude of certain ramified class field towers. In this 
section we show by a counterexample that this conjecture is wrong. Before we can state 
Perret’s conjecture, we need further notation and terminology. 

Let K/F_q be a global function field, let D be a positive divisor of K, and let S be a
nonempty set of rational places of K with S disjoint from the support supp(D) of D. For a
given prime l, the (l, D, S)-class field K_1 of K is the maximal abelian extension of K (in
a fixed separable closure of K) with a Galois group of exponent 1 or l such that the global
conductor of K_1/K divides D and all places in S split completely in K_1/K. If

    D = Σ_{P ∈ supp(D)} m_P P

with positive integers m_P, then we define the positive divisor D_1 of K_1 by

    D_1 = Σ_{P ∈ supp(D)} Σ_{Q|P} m_P Q.

We let S_1 be the set of all places of K_1 lying over those in S. Now we iterate the construction
above by letting K_2 be the (l, D_1, S_1)-class field of K_1, and so on. In this way we obtain
the tower

    K = K_0 ⊆ K_1 ⊆ K_2 ⊆ ...,

which is called the (l, D, S)-class field tower of K. This tower is called infinite if K_n ≠
K_{n+1} for all n ≥ 0. The following result provides a lower bound for the quantity A(q) in
Definition 2.


Lemma 1 Let K/F_q be a global function field with g(K) ≥ 1, let p be the characteristic
of F_q, and let D be a positive divisor of K whose support consists of a single place. If the
(p, D, S)-class field tower of K is infinite for some nonempty set S of rational places of K
with S disjoint from supp(D), then

    A(q) ≥ 2|S|/(2g(K) + deg(D) − 2).


Proof: The degree of the different Diff(K_n/K) of the extension K_n/K satisfies

    deg(Diff(K_n/K)) ≤ ([K_n : K] − 1) deg(D)

by [19, Théorème 1 and Proposition 4]. Therefore the Hurwitz genus formula yields

    2g(K_n) − 2 = [K_n : K](2g(K) − 2) + deg(Diff(K_n/K))
                ≤ [K_n : K](2g(K) + deg(D) − 2) − deg(D).

Hence we get

    A(q) ≥ lim sup_{n→∞} 2N(K_n)/(2g(K_n))
         ≥ lim_{n→∞} 2|S|[K_n : K]/([K_n : K](2g(K) + deg(D) − 2) − deg(D) + 2)
         = 2|S|/(2g(K) + deg(D) − 2).  □


Let K/F_q, D, S, and l be as at the beginning of this section. For an abelian group G we
write d_l G for the l-rank of G. The following conjecture was stated in [19, Conjecture 1].


Perret's Conjecture. If

    d_l G + |S| ≤ (1/4)(d_l G)^2

with G = Gal(K_1/K), then the (l, D, S)-class field tower of K is infinite.
Our construction of a counterexample to Perret's conjecture is based on the theory of
narrow ray class fields; see [3, Section 16], [8, Section 2] for convenient summaries of this
theory. Let K/F_q be a global function field with N(K) ≥ 1 and distinguish a rational place
∞ of K. Let A be the ring of elements of K that are regular outside ∞. The Hilbert class
field H_A is the maximal unramified abelian extension of K (in a fixed separable closure of
K) in which ∞ splits completely. Let M be a nonzero integral ideal of A and Λ(M) the
corresponding M-torsion module stemming from the action of a sign-normalized Drinfeld
A-module of rank 1 defined over H_A. Then E_M = H_A(Λ(M)) is the narrow ray class field
over K with modulus M.

Now we take q = 2 and we consider specifically a global function field K/F_2 with
g(K) = 6 and N(K) = 10; see [6, Example 6] for an explicit construction of such a
function field. Distinguish a rational place ∞ of K and let the ring A be as above. By
[34, Lemma 8] or by inspection, there exists a place P of K of degree 18. Now let E_M
be the narrow ray class field over K with modulus M = P^2. Note that the place ∞
splits completely in E_M/K by the theory of narrow ray class extensions. Furthermore,
we have

    d_2 Gal(E_M/K) ≥ d_2 Gal(E_M/H_A) = d_2 (A/M)^* = 18,

where the last identity follows from the proof of [8, Theorem 3]. Since a subgroup of
Gal(E_M/K) generated by 9 Artin symbols has 2-rank at most 9, it follows that there exists
a subfield L of E_M/K such that all 10 rational places of K split completely in L/K and
Gal(L/K) ≅ (Z/2Z)^9. Next we need the following result.


Lemma 2 With the notation above, the global conductor of L/K divides 2P.


Proof: By the theory of narrow ray class extensions, P is the only possible ramified place
in L/K. Thus, by [19, Proposition 1] it remains to show that |G_2| = 1, where G_i is the
i-th ramification group of P in L/K. Let d be the different exponent and e the ramification
index of P in L/K and let a be the least integer k ≥ 0 such that |G_i| = 1 for all i ≥ k.
From [15, Lemmas 1 and 2] we deduce that

    c := (d + a)/e ≤ 2.

Note that e = 2^r with 0 ≤ r ≤ 9. If r = 0, then P is unramified in L/K, and so already
|G_0| = 1. If r ≥ 1, then |G_0| = |G_1| = 2^r by ramification theory. If we had |G_2| > 1,
then the Hilbert different formula shows that

    d ≥ 2(e − 1) + (a − 2) = 2e + a − 4.

Also a ≥ 3, and so

    c = (d + a)/e ≥ 2 + (2a − 4)/2^r > 2,

which is a contradiction. □
Altogether, it follows that if S is the set of all 10 rational places of K and if K_1 is the
(2, 2P, S)-class field of K, then L ⊆ K_1. This implies that

    d_2 Gal(K_1/K) ≥ d_2 Gal(L/K) = 9,

and so the condition in Perret's conjecture is satisfied. Hence the validity of this conjecture
would imply that the (2, 2P, S)-class field tower of K is infinite. But then Lemma 1
would yield

    A(2) ≥ 2|S|/(2g(K) + deg(2P) − 2) = 10/23,

and this contradicts the bound (3).

This counterexample demonstrates that Perret's conjecture is not valid in general. Consequently,
Conjecture 1' in [19] is also wrong and the lower bounds on A(q) in [19, Section II]
remain unproved.
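The arithmetic behind the contradiction is short enough to check directly: with |S| = 10, g(K) = 6 and deg(2P) = 36, the bound of Lemma 1 evaluates to 10/23, which exceeds 2^{1/2} − 1. A minimal Python sketch, with a function name of our own choosing:

```python
from fractions import Fraction
from math import sqrt

def lemma1_lower_bound(num_split_places, genus, deg_D):
    """Value 2|S| / (2 g(K) + deg(D) - 2) from Lemma 1, as an exact fraction."""
    return Fraction(2 * num_split_places, 2 * genus + deg_D - 2)

# Data of the counterexample: |S| = 10, g(K) = 6, D = 2P with deg(P) = 18.
bound = lemma1_lower_bound(10, 6, 2 * 18)
print(bound, float(bound), sqrt(2) - 1)

# Exact comparison without floating point: 10/23 > sqrt(2) - 1  iff  (33/23)^2 > 2.
print((bound + 1) ** 2 > 2)
```

Since 10/23 = 0.4347... and 2^{1/2} − 1 = 0.4142..., an infinite tower would indeed violate (3).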


5 Constructions of Curves with many Rational Points 


In this section we present a brief survey of constructions of algebraic curves over finite fields 
with many rational points. Since earlier surveys are available in [11], [26], we concentrate 
on recent construction techniques. It will again be convenient to adopt the equivalent 
viewpoint of global function fields. 

For practical applications of global function fields with many rational places it is often
desirable to have the function fields available in explicit form, i.e., in terms of generators
over the rational function field and defining equations. Explicit constructions are usually
based on special extensions such as Artin-Schreier extensions and Kummer extensions or
on cyclotomic function fields. Note that a cyclotomic function field is the special case of a
narrow ray class field (see Section 4) in which the base field K/F_q is the rational function
field F_q(x). A discussion of such explicit constructions and some examples are given in
[11, Section 3]. Very recent work on explicit constructions was done by van der Geer and
van der Vlugt [27], [28].

A method that has recently been used to obtain global function fields with many rational 
places is based on Hilbert class fields (see Section 4 for the definition). We refer e.g. to the 
paper [7] of the authors. It is not so much the Hilbert class field H_A itself that is suitable
here, but one has to construct certain subextensions of H_A/K in which one forces rational
places of K to split completely. The method works particularly well if the base field K has 
a relatively small genus and a relatively large divisor class number. 

Powerful techniques for the construction of global function fields with many rational 
places can be based on the theory of narrow ray class extensions. Again, it is not so much 
the narrow ray class extension itself that is used, but rather certain specially constructed 
subfields thereof. These methods go back to the papers [8], [33], [34] of the authors. A 
general method of this type for the case where the constant field is not a prime field was 
recently introduced in [17]. This method contains earlier constructions, such as the first two 
constructions in [8], as special cases. The following Theorem 3 summarizes the method. 
For a base field F/F_q we write F_r = F_{q^r} · F for the constant field extension of F with full
constant field F_{q^r}. Furthermore, h(K) denotes the divisor class number of a global function
field K.


Theorem 3 Let F/F_q be a global function field of genus g(F). For an integer r ≥ 2 let t
be a positive divisor of q^r − 1 and put s = gcd(q − 1, t). Suppose that F has a place of
degree d with gcd(d, r) = 1 and that N(F) ≥ 1 + ε_d, where ε_d = 1 if d = 1 and ε_d = 0 if
d ≥ 2. Then for every integer n ≥ 1 there exists a global function field K_n/F_{q^r} with


re(K,) 2 =< 3 2 DAD) 


t(q — 1)h(F) 
(g= 1)(q@" ik) d(r—1)(n—1) 
- | 22(F dn — 2 


CG NG" Ng! =): 
(q4 — 1)(g" — N@ee-) — 1) 


(< ai. ) hCFr) es ily ) (q — 1)(q@ — h(E) 
t(q — 1) h(F) t(q — 1) (q? — 1)(q@" — ACF) 
d(r—1)(n—1) 


q 


and 


sq" — DAC) ae—na—wy 
NONE a= ery (N(F) 
q — 14" — DAF) ae ACF) 
(q4 — 1)(q" — I)h(F) h(F) * 


Theorem 3 has produced a large number of new examples of global function fields K/F_{q^r}
with many rational places, particularly for the cases q^r = 4, 8, 9, 16, 27 (see [17]). Further
special constructions of global function fields K/F_q with many rational places using narrow
ray class extensions can be found in [34] for q = 2, in [10] for q = 3, in [8] for q = 4, in
[12] for q = 5, and in [14] for q = 8, 16.




6 Applications to Algebraic Coding Theory 


Lower bounds on the quantity A(q) in Definition 2, such as those mentioned in Section 3, 
have important applications to algebraic coding theory. Due to the celebrated work of 
V.D. Goppa, algebraic curves over F_q with many rational points can be used to construct
good linear codes over F_q. For values of q for which A(q) is larger than a known comparison
function, Goppa's construction of algebraic-geometry codes leads to improvements on the
classical Gilbert-Varshamov bound for the existence of good linear codes over F_q. We refer
to the books [23] and [24] for background on coding theory and algebraic-geometry codes. 
For a linear code C over F_q we denote by n(C), k(C), and d(C) the length, the dimension,
and the minimum distance of C, respectively. Let U_q^{lin} be the set of ordered pairs (δ, R) ∈
[0, 1]^2 for which there exists an infinite sequence C_1, C_2, ... of linear codes over F_q with
n(C_i) → ∞ and

    δ = lim_{i→∞} d(C_i)/n(C_i),    R = lim_{i→∞} k(C_i)/n(C_i).

According to [24, Section 1.3.1], there exists a continuous function α_q^{lin} on [0, 1] such that

    U_q^{lin} = {(δ, R) : 0 ≤ R ≤ α_q^{lin}(δ), 0 ≤ δ ≤ 1},

where it is known that α_q^{lin}(0) = 1, α_q^{lin}(δ) = 0 for δ ∈ [(q − 1)/q, 1], and α_q^{lin} decreases
on the interval [0, (q − 1)/q]. The function α_q^{lin} is unknown, but it is an important issue in
algebraic coding theory to obtain good lower bounds for α_q^{lin} on the interval (0, (q − 1)/q).
The Gilbert-Varshamov bound says that

    α_q^{lin}(δ) ≥ R_GV(q, δ) := 1 − H_q(δ) for all δ ∈ (0, (q − 1)/q),

where H_q is the q-ary entropy function

    H_q(δ) = δ log_q(q − 1) − δ log_q δ − (1 − δ) log_q(1 − δ),

with log_q denoting the logarithm to the base q. Algebraic-geometry codes lead to the bound

    α_q^{lin}(δ) ≥ R_AG(q, δ) := 1 − 1/A(q) − δ for all δ ∈ [0, 1].
A seminal result of Tsfasman, Vlăduţ, and Zink [25] shows that R_AG(q, δ) > R_GV(q, δ) if
q is a sufficiently large square and δ belongs to a suitable subinterval of [0, 1]. The proof
of this theorem is based on the fact that A(q) = q^{1/2} − 1 if q is a square (see Section 3). It
was open until recently whether an analogous result holds for nonsquares q. The following
theorem of the authors [13], which is a consequence of Theorem 1 in Section 3, settles this
problem for sufficiently large composite nonsquares q.


Theorem 4 Let m ≥ 3 be an odd integer and let r be a prime power with r ≥ 100m^3
for odd r and r ≥ 576m^3 for even r. Then there exists an open interval (δ_1, δ_2) ⊆ (0, 1)
containing (r^m − 1)/(2r^m − 1) such that

    R_AG(r^m, δ) > R_GV(r^m, δ) for all δ ∈ (δ_1, δ_2).
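The comparison between the two bounds is easy to explore numerically in the square case, where A(q) = q^{1/2} − 1 is known exactly. The Python sketch below is an illustration under that assumption (it does not cover the nonsquare case of Theorem 4); the function names are our own, and the test point δ = (q − 1)/(2q − 1) is the one appearing in Theorem 4.

```python
from math import log, sqrt

def entropy_q(q, delta):
    """q-ary entropy H_q(delta) for 0 < delta < 1."""
    return (delta * log(q - 1, q)
            - delta * log(delta, q)
            - (1 - delta) * log(1 - delta, q))

def r_gv(q, delta):
    """Gilbert-Varshamov bound R_GV(q, delta) = 1 - H_q(delta)."""
    return 1 - entropy_q(q, delta)

def r_ag_square(q, delta):
    """R_AG(q, delta) = 1 - 1/A(q) - delta, using A(q) = sqrt(q) - 1 (q a square)."""
    return 1 - 1 / (sqrt(q) - 1) - delta

for q in (16, 49, 64):
    delta = (q - 1) / (2 * q - 1)
    print(q, round(r_gv(q, delta), 4), round(r_ag_square(q, delta), 4))
```

At this test point the Gilbert-Varshamov bound is still the larger one for q = 16, while the algebraic-geometry bound is larger for q = 49 and q = 64.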


7 Applications to Low-Discrepancy Sequences 


Low-discrepancy sequences are sequences of points in an s-dimensional unit cube [0, 1]^s
that are distributed very uniformly. Such sequences were studied in the last few decades
by number theorists and they are used, for instance, in quasi-Monte Carlo methods for
numerical integration over [0, 1]^s (see the book [5] for a general background). The most
powerful constructions of low-discrepancy sequences yield so-called digital (t, s)-sequences
constructed over F_q; for the precise definition we refer to the recent survey article [9]. Here
s is, as above, the dimension in which the sequence lives and the integer t ≥ 0 is the
quality parameter of the sequence. The value of t should be as small as possible for a good
low-discrepancy sequence.

Recent work of the authors has shown that global function fields over F_q with many
rational places can be used to obtain the currently best digital (t, s)-sequences constructed
over F_q. The most general construction is that in [32] and it has the following ingredients.
For a given q and a given dimension s ≥ 1, let K/F_q be a global function field containing
at least one rational place P_∞ and let D be a positive divisor of K with deg(D) = 2g(K)
and P_∞ not in the support of D. If P_1, ..., P_s are s distinct places of K with P_i ≠ P_∞ for
1 ≤ i ≤ s, then we obtain a digital (t, s)-sequence constructed over F_q with

    t = g(K) + Σ_{i=1}^{s} (deg(P_i) − 1).

If at least one of the places P_i, say P_1, is rational, then we can simply take D = 2g(K)P_1.
In the special case where N(K) ≥ s + 1, we can choose all P_i, 1 ≤ i ≤ s, to be rational
places, and then by optimizing K we get

    t = V_q(s) := min{g ≥ 0 : N_q(g) ≥ s + 1},        (4)

where N_q(g) is as in Definition 1.

For any q and any dimension s ≥ 1 let d_q(s) be the least value of t such that there exists
a digital (t, s)-sequence constructed over F_q. In Table 3 below, which is reprinted from
[18, Table 5], we tabulate upper bounds for d_q(s) for q = 2, 3, 5 and 1 ≤ s ≤ 50. Most of
the bounds in Table 3 are obtained from the trivial inequality d_q(s) ≤ V_q(s) implied by (4)
and from upper bounds for V_q(s) that can be read off immediately from Tables 1 and 2.
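Reading an upper bound for V_q(s) off a table of lower bounds for N_q(g) amounts to finding the first genus g whose tabulated lower bound is at least s + 1. The Python sketch below illustrates this for q = 2; the function name is our own, and the list holds N_2(0) = 3 followed by the q = 2 entries of Table 1 for 1 ≤ g ≤ 10.

```python
def v_upper(lower_bounds, s):
    """Smallest g with lower_bounds[g] >= s + 1, an upper bound for V_q(s).

    lower_bounds[g] must be a valid lower bound for N_q(g), g = 0, 1, 2, ...
    Returns None if the supplied table is too short to decide.
    """
    for g, n in enumerate(lower_bounds):
        if n >= s + 1:
            return g
    return None

# q = 2: N_2(0) = q + 1 = 3, then the values for g = 1, ..., 10 from Table 1.
N2_LOWER = [3, 5, 6, 7, 8, 9, 10, 10, 11, 12, 13]

print([v_upper(N2_LOWER, s) for s in range(1, 13)])
# prints: [0, 0, 1, 1, 2, 3, 4, 5, 6, 8, 9, 10]
```

For s ≤ 12 the resulting values can be compared with the q = 2 row of Table 3.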


Table 1 Bounds for N_q(g)

g\q   2      3      4      5      8      9      16     27
1     5      7      9      10     14     16     25     38
2     6      8      10     12     18     20     33     48
3     7      10     14     16     24     28     38     56
4     8      12     15     18     25-29  30     45-47  64-68
5     9      12-14  17-18  20-22  29-32  32-36  49-55  55-78
6     10     14-15  20     21-25  33-36  35-40  65     76-88
7     10     16-17  21-22  22-27  33-39  39-43  63-70  64-98
8     11     15-18  21-24  22-29  34-43  38-47  61-76  92-108
9     12     19     26     26-32  45-47  40-51  72-81  82-118
10    13     19-21  27-28  27-34  38-50  54-55  81-87  82-128

(Table 1 Contd.)

g\q   8        9        16       27
11    48-54    55-59    80-92    96-138
12    49-57    55-63    68-97    109-148
13    50-61    60-66    97-103   136-156
14    65       56-70    97-108   84-164
15    54-68    64-74    98-113   82-171
16    56-71    74-78    93-118   81-178
17    61-74    56-82    96-124   128-185
18    65-77    46-85    93-129   94-192
19    58-80    84-88    121-134  126-199
20    68-83    48-91    121-140  133-207
21    72-86    82-95    129-145  163-214
22    66-89    78-98    129-150  112-221
23    68-92    92-101   126-155  114-228
24    66-95    91-104   129-161  166-235
25    66-97    64-108   144-166  196-242
26    72-100   110-111  150-171  108-249
27    96-103   60-114   128-176  114-256
28    97-106   105-117  136-181  108-263
29    97-109   104-120  130-187  114-270
30    80-112   60-123   144-192  117-277
31    72-115   84-127   150-197  114-284
32    72-118   81-130   132-202  126-291
33    92-121   78-133   153-207  220-298
34    80-124   111-136  156-213  135-305
35    106-127  84-139   144-218  126-312
36    105-130  110-142  185-223  244-319
37    121-132  120-145  208-228  162-326
38    129-135  105-149  152-233  144-333
39    117-138  84-152   160-239  271-340
40    100-141  90-155   162-244  244-346
41    112-144  84-158   216-249  153-353
42    129-147  90-161   162-254  280-360
43    100-150  120-164  226-259  196-367
44    129-153  90-167   162-264  153-374
45    144-156  112-170  242-268  171-381
46    129-158  138-173  243-273  162-388
47    120-161  154-177  176-277  174-395
48    126-164  163-180  184-282  244-402
49    130-167  168-183  192-286  268-409
50    130-170  182-186  200-291  180-416

Table 2 Bounds for N_2(g)

g        51     52     53     54     55     56     57     58     59
N_2(g)   36-41  34-42  40-42  42-43  36-43  38-44  40-45  40-45  40-46

g        60     61     62     63     64     65     66     67     68
N_2(g)   40-47  40-47  44-48  42-48  42-49  48-50  42-50  44-51  45-51

g        69     70     71     72     73     74     75     76     77
N_2(g)   49-52  46-53  44-53  48-54  48-54  48-55  48-56  50-56  52-57

g        78     79     80     81     82     83     84     85     86
N_2(g)   48-57  52-58  56-59  48-59  53-60  52-60  57-61  52-62  56-62

g        87     88     89     90     91     92     93     94     95
N_2(g)   56-63  56-63  56-64  56-65  54-65  60-66  56-66  56-67  65-68


Table 3 Upper bounds for d_q(s)


q\s   1   2   3   4   5   6   7   8   9   10  11  12  13
2     0   0   1   1   2   3   4   5   6   8   9   10  11
3     0   0   0   1   1   1   2   3   3   4   4   5   6
5     0   0   0   0   0   1   1   1   1   2   2   3   3


q\s   14  15  16  17  18  19  20  21  22  23  24  25  26

q\s   27  28  29  30  31  32  33  34  35  36  37  38

q\s   39  40  41  42  43  44  45  46  47  48  49  50

[The entries for q = 2, 3, 5 in the ranges 14 ≤ s ≤ 50 are not legible in the source.]

References 


[1] M. Deuring, Die Typen der Multiplikatorenringe elliptischer Funktionenkörper, Abh. Math. Sem.
Univ. Hamburg 14 (1941), 197-272.

[2] R. Fuhrmann and F. Torres, The genus of curves over finite fields with many rational points,
Manuscripta Math. 89 (1996), 103-106.

[3] D.R. Hayes, A brief introduction to Drinfeld modules, The Arithmetic of Function Fields
(D. Goss, D.R. Hayes and M.I. Rosen, eds.), pp. 1-32, W. de Gruyter, Berlin, 1992.

[4] Y. Ihara, Some remarks on the number of rational points of algebraic curves over finite fields,
J. Fac. Sci. Univ. Tokyo Sect. IA Math. 28 (1981), 721-724.

[5] H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods, SIAM,
Philadelphia, 1992.

[6] H. Niederreiter and C.P. Xing, Explicit global function fields over the binary field with many
rational places, Acta Arith. 75 (1996), 383-396.

[7] H. Niederreiter and C.P. Xing, Cyclotomic function fields, Hilbert class fields, and global
function fields with many rational places, Acta Arith. 79 (1997), 59-76.

[8] H. Niederreiter and C.P. Xing, Drinfeld modules of rank 1 and algebraic curves with many
rational points. II, Acta Arith. 81 (1997), 81-100.

[9] H. Niederreiter and C.P. Xing, The algebraic-geometry approach to low-discrepancy sequences,
Monte Carlo and Quasi-Monte Carlo Methods 1996 (H. Niederreiter et al., eds.), Lecture Notes
in Statistics, vol. 127, pp. 139-160, Springer, New York, 1997.

[10] H. Niederreiter and C.P. Xing, Global function fields with many rational places over the ternary
field, Acta Arith. 83 (1998), 65-86.

[11] H. Niederreiter and C.P. Xing, Algebraic curves over finite fields with many rational points,
Proc. Number Theory: Diophantine, Computational and Algebraic Aspects (K. Győry et al.,
eds.), pp. 423-443, W. de Gruyter, Berlin, 1998.

[12] H. Niederreiter and C.P. Xing, Global function fields with many rational places over the quinary
field, Demonstratio Math. 30 (1997), 919-930.

[13] H. Niederreiter and C.P. Xing, Towers of global function fields with asymptotically many
rational places and an improvement on the Gilbert-Varshamov bound, Math. Nachr. 195 (1998),
171-186.

[14] H. Niederreiter and C.P. Xing, Algebraic curves with many rational points over finite fields of
characteristic 2, Number Theory in Progress (K. Győry et al., eds.), pp. 359-380, W. de Gruyter,
Berlin, 1999.

[15] H. Niederreiter and C.P. Xing, Global function fields with many rational places and their
applications, Finite Fields: Theory, Applications, and Algorithms (R.C. Mullin and G.L. Mullen, eds.),
Contemporary Math., vol. 225, pp. 87-111, American Math. Society, Providence, R.I., 1999.

[16] H. Niederreiter and C.P. Xing, Curve sequences with asymptotically many rational points, Proc.
AMS Summer Research Conf. (Seattle, 1997), to appear.

[17] H. Niederreiter and C.P. Xing, A general method of constructing global function fields with
many rational places, Algorithmic Number Theory (J.P. Buhler, ed.), Lecture Notes in Computer
Science, vol. 1423, pp. 555-566, Springer, Berlin, 1998.

[18] H. Niederreiter and C.P. Xing, Nets, (t, s)-sequences, and algebraic geometry, Random and
Quasi-Random Point Sets (P. Hellekalek and G. Larcher, eds.), Lecture Notes in Statistics,
vol. 138, pp. 267-302, Springer, New York, 1998.

[19] M. Perret, Tours ramifiées infinies de corps de classes, J. Number Theory 38 (1991), 300-322.

[20] J.-P. Serre, Sur le nombre des points rationnels d'une courbe algébrique sur un corps fini, C.R.
Acad. Sci. Paris Sér. I Math. 296 (1983), 397-402.

[21] J.-P. Serre, Nombres de points des courbes algébriques sur F_q, Sém. Théorie des Nombres
1982-1983, Exp. 22, Université de Bordeaux I, Talence, 1983.

[22] J.-P. Serre, Rational Points on Curves over Finite Fields, Lecture Notes, Harvard University,
1985.

[23] H. Stichtenoth, Algebraic Function Fields and Codes, Springer, Berlin, 1993.

[24] M.A. Tsfasman and S.G. Vlăduţ, Algebraic-Geometric Codes, Kluwer, Dordrecht, 1991.

[25] M.A. Tsfasman, S.G. Vlăduţ and T. Zink, Modular curves, Shimura curves, and Goppa codes,
better than Varshamov-Gilbert bound, Math. Nachr. 109 (1982), 21-28.

[26] G. van der Geer and M. van der Vlugt, How to construct curves over finite fields with many
points, Arithmetic Geometry (F. Catanese, ed.), pp. 169-189, Cambridge University Press,
Cambridge, 1997.

[27] G. van der Geer and M. van der Vlugt, Generalized Reed-Muller codes and curves with many
points, preprint, 1997.

[28] G. van der Geer and M. van der Vlugt, Constructing curves over finite fields with many points
by solving linear equations, preprint, 1997.

[29] S.G. Vlăduţ and V.G. Drinfeld, Number of points of an algebraic curve, Functional Analysis
Appl. 17 (1983), 53-54.

[30] W.C. Waterhouse, Abelian varieties over finite fields, Ann. Sci. École Norm. Sup. (4) 2 (1969),
521-560.

[31] C.P. Xing, Maximal function fields and function fields with many rational places over finite
fields of characteristic 2, preprint, 1997.

[32] C.P. Xing and H. Niederreiter, A construction of low-discrepancy sequences using global
function fields, Acta Arith. 73 (1995), 87-102.

[33] C.P. Xing and H. Niederreiter, Modules de Drinfeld et courbes algébriques ayant beaucoup de
points rationnels, C.R. Acad. Sci. Paris Sér. I Math. 322 (1996), 651-654.

[34] C.P. Xing and H. Niederreiter, Drinfeld modules of rank 1 and algebraic curves with many
rational points, Monatsh. Math. 127 (1999), 219-241.

[35] T. Zink, Degeneration of Shimura surfaces and a problem in coding theory, Fundamentals
of Computation Theory (L. Budach, ed.), Lecture Notes in Computer Science, vol. 199,
pp. 503-511, Springer, Berlin, 1985.


Institute of Discrete Mathematics 
Austrian Academy of Sciences 
Sonnenfelsgasse 19 

A-1010 Vienna 

Austria 

E-Mail: niederreiter@oeaw.ac.at 


Department of Mathematics 
National University of Singapore 
2 Science Drive 2 

Singapore 117543 

E-Mail: xingcp@math.nus.edu.sg


A Report on Artin’s Holomorphy Conjecture 


Dipendra Prasad and C S Yogananda 


1 Introduction 


The purpose of this paper is to present a report on the current status of Artin’s holomorphy 
conjecture. For a fascinating account of how Artin was led to defining his L-series and his 
‘reciprocity law’ see [19]. 


2 Preliminaries 


2.1 Definition and Properties of Artin’s L-series 


We begin with the definition of Artin's L-series. Let L/K be a finite normal extension of
number fields with Galois group G. For a prime p of K and q a prime in L above p let G_q
denote the decomposition subgroup and I_q the inertia subgroup of G corresponding to q:

    G_q = {g ∈ G | g(q) = q};    I_q = {g ∈ G_q | g(x) ≡ x (mod q)}.

Let σ_q denote the canonical generator, the Frobenius at q, of the cyclic group G_q/I_q.
Note that if q' is another prime in L above p then the corresponding decomposition group,
inertia group and Frobenius are conjugates in G of G_q, I_q and σ_q. Let N denote the norm
from K to Q. If a group G acts on a vector space V and H is a subgroup of G then let V^H
denote the subspace of H-fixed vectors of V, i.e., V^H = {v ∈ V | h(v) = v, ∀h ∈ H}.


Definition: Suppose ρ : G → GL(V) is a representation of G on an n-dimensional complex
vector space V. Following Artin one can associate an L-series to ρ: for Re(s) > 1 put

    L(s, ρ) = Π_p 1/det(1 − ρ|_{V^{I_q}}(σ_q) Np^{−s}),

the product ranging over all finite primes of K.


The product above converges for Re(s) > 1 and hence defines an analytic function in
that region. Since the determinant remains invariant under conjugation, the local factors do
not depend upon the choice of the prime q in L above p. Further, since two isomorphic
representations have the same determinant, L(s, ρ) depends only on the character χ of the
representation ρ so that we may as well write L(s, χ) instead of L(s, ρ). In fact, Artin
gave the following explicit expression of L(s, ρ) which is written purely in terms of the
character of a representation:

    log L(s, χ) = Σ_p Σ_{n=1}^{∞} χ(σ_q^n)/(n Np^{ns}),   where χ(σ_q^n) = (1/|I_q|) Σ_{τ ∈ σ_q^n I_q} χ(τ).


Examples:


1. Suppose χ_0 is the trivial character of G. Then the Artin L-series L(s, χ_0) is nothing
but the Dedekind zeta function ζ_K(s) of K.

2. Suppose χ_R is the character of the regular representation of G. Then the Artin L-series
L(s, χ_R) is the Dedekind zeta function ζ_L(s) of L. More generally, if H is a subgroup of G
with M the subfield of L fixed by H and χ_{R_H} is the character of the left-regular
representation of G on functions on G/H then L(s, χ_{R_H}) is the Dedekind zeta function ζ_M(s) of M.


Properties: 


1. If ρ_1, ρ_2 are two representations with characters χ_1, χ_2 then

    L(s, χ_1 + χ_2) = L(s, χ_1) L(s, χ_2).

This enables one to define an Artin L-series for virtual characters of G. For example,
(with L/M/K as above) ζ_M(s)/ζ_K(s) is the Artin L-series associated to the character
χ_{R_H} − χ_0 of G.

2. If H is a subgroup of G and χ a character of H then

    L(s, Ind χ) = L(s, χ)

where Ind χ is the character of G induced from the character χ of H.

3. If H is a normal subgroup of G with λ : G → G/H the canonical map and χ a
character of the quotient group G/H then

    L(s, Infl χ) = L(s, χ)

where Infl χ is the character χ ∘ λ of G.
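The simplest instance of Example 2 together with Property 1 is K = Q, L = Q(i): here G has order 2, the regular character is χ_0 + χ with χ the nontrivial character, and so ζ_L(s) = ζ(s) L(s, χ). Identifying χ with the nontrivial Dirichlet character modulo 4, this says that the number of ideals of Z[i] of norm n equals Σ_{d|n} χ(d). The Python sketch below checks this coefficient identity for small n; the function names are our own, and the ideal count uses the facts that Z[i] is a principal ideal domain with four units.

```python
from math import isqrt

def chi(d):
    """Nontrivial character of Gal(Q(i)/Q), viewed as a Dirichlet character mod 4."""
    return {1: 1, 3: -1}.get(d % 4, 0)

def coeff_zeta_times_L(n):
    """n-th Dirichlet coefficient of zeta(s) * L(s, chi): sum of chi(d) over d | n."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)

def ideals_of_norm(n):
    """Number of ideals of Z[i] of norm n: #{(x, y) : x^2 + y^2 = n} divided by 4."""
    reps = 0
    bound = isqrt(n)
    for x in range(-bound, bound + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            reps += 1 if y == 0 else 2    # y and -y give distinct solutions if y != 0
    return reps // 4

mismatches = [n for n in range(1, 201) if ideals_of_norm(n) != coeff_zeta_times_L(n)]
print(mismatches)   # prints: []
```

The empty list of mismatches reflects, coefficient by coefficient, the factorization ζ_L(s) = L(s, χ_0) L(s, χ).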


2.2 Artin’s Conjecture 


Conjecture 2.1 If ρ does not contain the trivial representation then L(s, ρ) has an analytic
continuation as a holomorphic function to the whole of the complex plane.


A consequence of this conjecture is that for any finite extension of number fields M/K,
ζ_K(s) divides ζ_M(s), i.e., ζ_M(s)/ζ_K(s), which is the Artin L-series of the character χ_{R_H} − χ_0,
should be a holomorphic function on C. (Here L is a Galois extension of K containing M
with Galois group G and H is the subgroup of G fixing M.)

This statement is known as Dedekind’s conjecture. In fact, Artin seems to have been 
led to his L-series while trying to prove this statement which was proved for pure cubic 
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fields by Dedekind in 1873. Artin himself was able to prove such a result when M/K 
is an Icosahedral extension (and also for some intermediate extensions therein), [1]. In 
the case of a normal extension L/K Dedekind’s conjecture was proved without assuming 
Artin’s conjecture by R. Brauer, [5] and independently by Aramata, (see 3.2 below). In 
the direction of Dedekind’s conjecture in the non-normal case Uchida, [22], and van der 
Waal, [23], (independently) have proved the following. 


Theorem 2.1 (Uchida, van der Waal). Let M/K be an extension of number fields and $\tilde{M}$
a normal closure of M. Suppose that $\mathrm{Gal}(\tilde{M}/K)$ is solvable. Then $\zeta_M(s)/\zeta_K(s)$ is entire.


3 Meromorphicity 


3.1 Artin’s Reciprocity Law 


In [2] Artin had conjectured that if G is abelian then $L(s, \rho)$ is nothing but an L-series
associated to certain characters introduced and studied by E. Hecke in 1917, [12]; Hecke
had proved analytic continuation and functional equation for these abelian L-series.

Hecke characters are idele class characters $\chi : C_K \to \mathbb{C}^*$ while Artin’s L-series are
associated to (characters of) representations of Gal(L/K). Thus, in order to identify his
L-series with Hecke L-series for L/K abelian, Artin needed an identification of the abelian
Galois group Gal(L/K) with a quotient of the idele class group such that for a prime $\mathfrak{p}$ of
K and $\mathfrak{q}$ a prime in L above $\mathfrak{p}$ the Frobenius at $\mathfrak{q}$ corresponds to a uniformising parameter
at $\mathfrak{p}$ in $C_K$. In 1927 Artin proved this ‘reciprocity law’ in [3] thereby proving his conjecture
in the case of abelian G.


3.2 Representations of Finite Groups and Meromorphicity 


In 1930, [4] Artin proved the following general result about representations of finite groups. 


Theorem 3.1 (Artin). Every character of a finite group G is a linear combination with 
rational coefficients of induced characters from cyclic subgroups. 


This shows that an integral power of $L(s, \rho)$ is a product of Hecke L-series and hence is
meromorphic. In the case of the regular representation $\rho_R$ of G, Brauer was able to prove
the following, [5], p. 244.


Lemma 3.1 (Brauer). The character $\chi_R - \chi_0$ of G can be expressed as a linear combination,
with positive rational coefficients with denominator |G|, of characters of G which
are induced by nontrivial irreducible characters of cyclic subgroups.


This implies that some integral power of $\zeta_L(s)/\zeta_K(s)$, which is the Artin L-series for the
character $\chi_R - \chi_0$ of G, is a product of abelian (Hecke) L-series corresponding to nontrivial
characters. Since $\zeta_L(s)/\zeta_K(s)$ is meromorphic and Artin’s conjecture is known to be true
for abelian extensions we have the


Theorem 3.2 (Aramata-Brauer). The quotient $\zeta_L(s)/\zeta_K(s)$ is entire.

In [6] R. Brauer was able to improve on Artin’s theorem; he proved the following theorem.


Theorem 3.3 (Brauer). Every character of a finite group G is a linear combination with 
integer coefficients of characters induced from one dimensional characters of subgroups. 


This showed that $L(s, \rho)$ itself is meromorphic and proved Artin’s conjecture for those
representations which are positive integral linear combinations of representations which 
are induced from one-dimensional representations, and hence also for those representations 
which are positive rational linear combinations of representations which are induced from 
one-dimensional representations. 

In [17] Stark proved the following result. The proof is so simple and elegant that we have 
decided to include it. 


Theorem 3.4 (Stark). If $\mathrm{ord}_{s=s_0}\,\zeta_L(s) \le 1$ then for any character $\chi$ of the Galois group
G of L over K, $L(s, \chi)$ is analytic at the point $s = s_0$.


Proof: Let $\chi$ be an irreducible representation of the Galois group G of L over K, and $\psi$ an
irreducible representation of a subgroup H of G. (The subgroup H will be defined later.)
Let $n_\chi$ and $n_\psi$ denote the orders of the zeros (or poles, counted with negative multiplicity) of
$L(s, \chi)$ and $L(s, \psi)$ at $s = s_0$ respectively. Define generalised characters,

\[ \theta_G = \sum_{\chi} n_\chi\, \chi, \qquad \theta_H = \sum_{\psi} n_\psi\, \psi. \]


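We will prove that $\theta_G$ is an irreducible character of dimension 1, and therefore $n_\chi \ge 0$,
proving the theorem. For this purpose, let

\[ \mathrm{Ind}_H^G\, \psi = \sum_{\chi} a(\psi, \chi)\, \chi. \]

Since $L(s, \psi) = L(s, \mathrm{Ind}_H^G\, \psi)$,

\[ n_\psi = \sum_{\chi} a(\psi, \chi)\, n_\chi. \]

By Frobenius reciprocity,

\[ \chi|_H = \sum_{\psi} a(\psi, \chi)\, \psi. \]

Therefore,

\[ \theta_G|_H = \sum_{\chi, \psi} n_\chi\, a(\psi, \chi)\, \psi = \sum_{\psi} n_\psi\, \psi = \theta_H \qquad (*). \]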


Now let $g \in G$, and H the cyclic subgroup of G generated by g. Since H is an abelian group,
Artin’s conjecture is true for representations of H. Factorisation of the zeta function of L as
the product of the L-functions attached to characters of H and the fact that $\mathrm{ord}_{s=s_0}\,\zeta_L(s) \le 1$
implies that $\theta_H = \psi$ for some linear character $\psi$ of H. From the equation (*), this means
that $\theta_G(g)$ is a root of unity for all $g \in G$. By Schur orthogonality,

\[ 1 = \frac{1}{|G|} \sum_{g \in G} |\theta_G(g)|^2 = \sum_{\chi} n_\chi^2. \]

Therefore, exactly one $n_\chi$ is non-zero. Now take H = {1} in (*). This gives $\theta_G(1) = 1$,
implying $n_\chi \cdot \chi(1) = 1$, which implies $n_\chi = 1$ and $\chi(1) = 1$.
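
The Schur orthogonality step of the proof can be checked numerically on a small group. The following is a minimal sketch, assuming the standard character table of $S_3$; the names `CLASS_SIZES`, `CHARACTERS` and `norm_squared` are illustrative and do not come from the text.

```python
from fractions import Fraction

# Character table of S_3 on its three conjugacy classes
# (identity, transpositions, 3-cycles), which have sizes 1, 3, 2.
CLASS_SIZES = [1, 3, 2]
CHARACTERS = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}


def norm_squared(multiplicities):
    """Return (1/|G|) * sum over g of |theta(g)|^2 for theta = sum of n_chi * chi.

    All character values of S_3 are real integers, so |theta(g)|^2 is a plain square.
    """
    order = sum(CLASS_SIZES)
    theta = [
        sum(n * CHARACTERS[name][k] for name, n in multiplicities.items())
        for k in range(len(CLASS_SIZES))
    ]
    total = sum(size * value * value for size, value in zip(CLASS_SIZES, theta))
    return Fraction(total, order)


# The norm equals the sum of the squares of the multiplicities n_chi.
example = {"trivial": 1, "sign": -2, "standard": 3}
print(norm_squared(example), sum(n * n for n in example.values()))
```

A generalised character of norm 1 therefore has exactly one non-zero multiplicity, equal to 1 or −1, which is the situation used in the proof.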

Using the ideas in [17], Foote and Kumar Murty, [11], were able to reprove Artin’s
conjecture for some characters, already covered by Brauer’s theorem, without writing the
characters as positive linear combinations of monomial characters. They do this by comparing
the order of the zero or pole of $L(s, \chi)$ at a fixed point, as $\chi$ varies over the irreducible
characters of G, with the order of the Dedekind zeta function $\zeta_L(s)$ at the same point. They
also proved the following.


Theorem 3.5 (Foote-Kumar Murty). Let L/K be a Galois extension of number fields
with soluble Galois group G and let $p_1 < p_2 < \cdots < p_n$ be the distinct prime divisors of
|G|. If $\zeta_L(s)$ has a zero of order r at $s = s_0$, where $r \le p_2 - 2$, then $L(s, \chi)$ is analytic at
$s_0$ for all irreducible characters $\chi$ of G.


In [14], Michler has explicitly described the relation between the orders of zeros of
Dedekind zeta functions and the orders of zeros or poles of Artin L-functions at a point in
$\mathbb{C} - \{1\}$ when the Galois group of the extension is $S_n$, the symmetric group of degree n.
Suppose $\mathrm{Gal}(L/K) = S_n$. Denote the set of all partitions $\lambda$ of n by P(n). The irreducible
characters $\chi_\mu$ of $S_n$ are parametrised by the partitions $\mu \in P(n)$. For each $\lambda \in P(n)$ there
is a unique Young subgroup $Y_\lambda$ of $S_n$; let $L_\lambda = L^{Y_\lambda}$ be its fixed field. The main result of
[14] asserts that for each point $s_0$ in $\mathbb{C} - \{1\}$ and each partition $\mu \in P(n)$, the order $n_\mu(s_0)$
of zero or pole of the Artin L-series $L(s, \chi_\mu)$ is uniquely determined by the orders of zeros
$r_\lambda(s_0)$ of the Dedekind zeta functions $\zeta_{L_\lambda}(s)$ of the fixed fields $L_\lambda$ for $\lambda \in P(n)$.


3.3 Functional Equation 


It follows from Brauer’s theorem (theorem 3.3) that Artin L-functions can be written as
a quotient of products of Hecke L-functions. Since Hecke L-functions have meromorphic
continuation to all of the complex plane, and a functional equation, so do Artin L-functions. 
In this section we state the functional equation satisfied by an Artin L-function. Before we 
can do that, we must recall the definition of Artin conductor and gamma factors associated 
to representations of the Galois group. 

For every representation V of the Galois group $\mathrm{Gal}(\overline{\mathbb{Q}}/K)$, there is an integral ideal
of K, denoted by f(V) and called the Artin conductor of the representation V, which is
divisible by exactly those primes in K which are ramified in the representation V. The Artin
conductor has the property that $f(V_1 \oplus V_2) = f(V_1) f(V_2)$ for any two representations
$V_1$ and $V_2$ of $\mathrm{Gal}(\overline{\mathbb{Q}}/K)$. For one dimensional representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/K)$, the Artin
conductor is the conductor of the corresponding Grossencharacter. There is a formula for
the Artin conductor involving the higher ramification groups for which we refer to the book of
Tate [20]. Here we just mention that if the representation V of $\mathrm{Gal}(\overline{\mathbb{Q}}/K)$ is induced from
a representation $\chi$ of $\mathrm{Gal}(\overline{\mathbb{Q}}/M)$, where M is an extension of K, then one has

\[ f(V) = D(M/K)^{\chi(1)}\, \mathrm{Norm}_{M/K}(f(\chi)), \]

where D(M/K) is the discriminant of M over K, and $f(\chi)$ is the Artin conductor of the
character $\chi$.

Suppose that the representation V of $\mathrm{Gal}(\overline{\mathbb{Q}}/K)$, with character $\chi$, factors through the
Galois group G of a field extension L of K. Let w be a place of L over an Archimedean place v of K. This
defines an element in G of order 1 or 2 which in the representation V has, say, a $+1$ eigenspace
of dimension $n_v^+$ and a $-1$ eigenspace of dimension $n_v^-$. If v is a real place, define a gamma factor

\[ L_v(s, V) = \left[\pi^{-s/2}\, \Gamma(s/2)\right]^{n_v^+} \left[\pi^{-(s+1)/2}\, \Gamma((s+1)/2)\right]^{n_v^-}. \]

If v is a complex place, define $L_v(s, V) = \left[2\,(2\pi)^{-s}\, \Gamma(s)\right]^{\chi(1)}$. With this notation, the
completed Artin L-function

\[ \Lambda(s, \chi) = \left\{ |d_K|^{\chi(1)}\, N_{K/\mathbb{Q}}(f(\chi)) \right\}^{s/2} \prod_{v \mid \infty} L_v(s, \chi)\; L(s, \chi) \]

satisfies the functional equation:

\[ \Lambda(1 - s, \chi) = W(\chi)\, \Lambda(s, \bar{\chi}), \]

where $W(\chi)$ is a non-zero complex number of absolute value 1 called the Artin root-number.
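
As a sanity check on the normalisation, one may specialise to K = Q and the trivial character $\chi_0$. The following sketch assumes only the definitions above and recovers the classical functional equation of the Riemann zeta function.

```latex
% K = \mathbb{Q}, \chi = \chi_0: the discriminant is d_K = 1, the conductor is
% f(\chi_0) = (1), and the single real place has n_v^+ = 1, n_v^- = 0.
% Since L(s, \chi_0) = \zeta(s), the completed L-function is
\Lambda(s, \chi_0) = \pi^{-s/2}\, \Gamma(s/2)\, \zeta(s),
\qquad
\Lambda(1 - s, \chi_0) = \Lambda(s, \chi_0),
% so the root number in this case is W(\chi_0) = 1.
```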


4 Two-dimensional Representations 


4.1 Dihedral, Tetrahedral and Octahedral Representations 


If $\rho$ is a 2-dimensional Galois representation over a number field k then $\rho$ determines a
faithful representation of Gal(K/k) where K is the finite normal extension of k corresponding
to the kernel of the representation and therefore $\rho$ corresponds to a finite subgroup of
GL(2, C). The finite subgroups of GL(2, C) have been classified by F. Klein: their image
in PGL(2, C) is one of the following.


1. Cyclic.

2. Dihedral.

3. Tetrahedral (isomorphic to $A_4$).

4. Octahedral (isomorphic to $S_4$).

5. Icosahedral (isomorphic to $A_5$).


Thus any finite subgroup of GL(2, C) is a cyclic central extension of one of the above
projective groups. The inverse images of the subgroups $A_4$, $S_4$, $A_5$ of SO(3, R) in SU(2),
which is the 2-fold cover of SO(3, R):

\[ 1 \to \mathbb{Z}/2 \to SU(2) \to SO(3, \mathbb{R}) \to 1, \]

are $\tilde{A}_4 = SL_2(\mathbb{Z}/3)$, $\tilde{S}_4 = GL_2(\mathbb{Z}/3)$, and $\tilde{A}_5 = SL_2(\mathbb{Z}/5)$. The group $\tilde{A}_4$ has
representations of dimensions 1,1,1,3,2,2,2, of which the last 3 are non-trivial on Z/2.
The group $\tilde{S}_4$ has representations of dimensions 1,1,3,3,2,2,2,4, of which the last 3 are
non-trivial on Z/2. The group $\tilde{A}_5$ has representations of dimensions 1,3,3,5,4,2,2,4,6, of
which the last 4 are non-trivial on Z/2.
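
The lists of dimensions above can be checked against the group orders 24, 48 and 120, since the squares of the degrees of the irreducible representations of a finite group sum to its order. The sketch below is only a consistency check; the dictionary names are illustrative.

```python
# Degrees of the irreducible representations as listed in the text,
# keyed by the order of the corresponding double cover.
DEGREES = {
    24: [1, 1, 1, 3, 2, 2, 2],            # cover of A_4
    48: [1, 1, 3, 3, 2, 2, 2, 4],         # cover of S_4
    120: [1, 3, 3, 5, 4, 2, 2, 4, 6],     # cover of A_5
}

# Number of trailing entries that are non-trivial on Z/2, as stated in the text.
NONTRIVIAL_ON_CENTRE = {24: 3, 48: 3, 120: 4}


def sum_of_squares(degrees):
    """Sum of the squares of a list of representation degrees."""
    return sum(d * d for d in degrees)


for order, degrees in DEGREES.items():
    tail = degrees[-NONTRIVIAL_ON_CENTRE[order]:]
    # All degrees account for the full order; those non-trivial on Z/2 for half of it.
    print(order, sum_of_squares(degrees), sum_of_squares(tail))
```

The representations that are non-trivial on Z/2 account for exactly half of the order, and the remaining ones for the order of the quotient $A_4$, $S_4$ or $A_5$.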

Artin’s conjecture in the cyclic and dihedral case follows from Artin reciprocity and 
Hecke’s proof that abelian L-series are entire (section 3.1). Langlands proved Artin’s 
conjecture for tetrahedral and Tunnell for octahedral representations. Thus the only case 
where Artin’s conjecture is not known over Q is the case of the icosahedral representations 
(section 5). 

Langlands has in fact generalised Artin’s conjecture to ask whether the L-function arising
from an irreducible n-dimensional Galois representation of a number field k is in fact the
L-function of a cusp form on GL(n, k); because then by known properties of the L-functions
of automorphic representations of GL(n, k), it will in particular follow that an Artin
L-function is entire if n > 1. This generalisation of Artin’s conjecture, which asks if an
Artin L-function is the L-function of an automorphic representation, is called the strong Artin
conjecture. Using the theory of base change, Langlands was able to prove such a statement
for 2-dimensional tetrahedral representations, which was then extended to the octahedral case
by Tunnell.


Theorem 4.1 (Langlands-Tunnell). A two dimensional representation of $\mathrm{Gal}(\overline{\mathbb{Q}}/k)$ with
values inside a finite solvable subgroup of GL(2, C) comes from an automorphic representation
of $GL(2, \mathbb{A}_k)$.


It follows from the ‘converse’ theory developed by Weil and Langlands that Artin’s
conjecture for 2-dimensional Galois representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ implies that there is a
natural one-to-one correspondence between equivalence classes of odd two-dimensional
Galois representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ and cusp forms of weight 1 in such a way that the
L-series associated (by Artin) to a Galois representation agrees with the L-series associated
(by Hecke) to the corresponding cusp form. This conjectural correspondence is to be viewed
as a natural generalisation of the equivalence, coming from class field theory, of characters
of the Galois group (of an abelian extension) and Hecke characters (Artin’s reciprocity).
These correspondences are special cases of the Langlands programme, see [15]. Here is
the theorem of Weil and Langlands.


Theorem 4.2 (Langlands-Weil). Let $\rho$ be an odd 2-dimensional Galois representation
over Q with conductor N and determinant $\varepsilon$. If $L(s, \rho \otimes \lambda)$ is an entire function for all
twists $\rho \otimes \lambda$ of $\rho$ by one-dimensional representations $\lambda$, then there is a cusp form f of
weight 1 on $\Gamma_1(N)$ with nebentypus character $\varepsilon$ such that $L_f(s) = L(s, \rho)$.


The following theorem due to Deligne and Serre constructs Galois representations asso- 
ciated to cusp forms of weight 1. 


Theorem 4.3 (Deligne-Serre). If f is a newform of type $(1, \varepsilon, N)$ then there is an odd
two-dimensional Galois representation $\rho : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to GL_2(\mathbb{C})$ with conductor N and
determinant $\varepsilon$ such that $L_f(s) = L(s, \rho)$.


4.2 Two Dimensional Representations of Prime Conductor


In this section, which has been taken entirely from the paper of Serre [16], we summarise
the classification of two dimensional representations $\rho$ of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ of prime conductor.
We will say that $\rho$ is dihedral, or $A_4$, $S_4$, $A_5$ depending on the image of $\rho$ in PGL(2, C).
We will denote the determinant of $\rho$ by $\varepsilon$.


Dihedral case: Two dimensional dihedral representations $\rho$ of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ of prime conductor
p arise only for $p \equiv 3 \pmod 4$ and are then induced from a non-trivial unramified
character of $K = \mathbb{Q}(\sqrt{-p})$. Such Galois representations indeed correspond to cusp forms
on $\Gamma_1(p)$ as is well-known from Hecke theory.


Non-dihedral case: Suppose that $\rho$ is a two dimensional non-dihedral representation of
$\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ of prime conductor p. Then,


(1) $p \not\equiv 1 \pmod 8$.
(2) If $p \equiv 5 \pmod 8$, $\rho$ is of type $S_4$, and the character $\varepsilon$ is of order 4 and conductor p.
(3) If $p \equiv 3 \pmod 4$, $\rho$ is of type $S_4$ or $A_5$, and $\varepsilon$ is the Legendre symbol $n \mapsto \left(\frac{n}{p}\right)$.
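
The congruence conditions of the dihedral and non-dihedral cases can be collected into a small filter. The sketch below only encodes these necessary conditions for an odd prime p; it does not decide whether a representation of a given type actually exists, and the function name `possible_types` is an illustrative choice.

```python
def possible_types(p):
    """Types of 2-dimensional representations of odd prime conductor p
    that are not excluded by the congruence conditions stated in the text."""
    if p < 3 or p % 2 == 0:
        raise ValueError("this sketch assumes p is an odd prime")
    if p % 4 == 3:
        # Dihedral representations occur only here; non-dihedral ones are S4 or A5.
        return ["dihedral", "S4", "A5"]
    if p % 8 == 5:
        # Non-dihedral representations must be of type S4.
        return ["S4"]
    # Remaining case p = 1 mod 8: no representation of prime conductor p.
    return []


for p in (13, 17, 23):
    print(p, possible_types(p))
```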


Conversely, start with a Galois extension L of Q and a prime number p. Consider the
following 3 cases.


(i) $\mathrm{Gal}(L/\mathbb{Q}) = S_4$ and $p \equiv 5 \pmod 8$.
(ii) $\mathrm{Gal}(L/\mathbb{Q}) = S_4$ and $p \equiv 3 \pmod 4$.
(iii) $\mathrm{Gal}(L/\mathbb{Q}) = A_5$ and $p \equiv 3 \pmod 4$.


An embedding of Gal(L/Q) in PGL(2, C) determines a projective representation $\tilde{\rho}$ of
$\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$. There exists a lifting of $\tilde{\rho}$ to GL(2, C) with prime conductor p, if and only if
one has the following in the 3 respective cases above:


(i) L is the normal closure of a non-real quartic field of discriminant $p^3$.
(ii) L is the normal closure of a quartic field of discriminant $-p$.
(iii) L is the normal closure of a non-real quintic field of discriminant $p^2$.


4.3 Tate’s Example 


Till 1976 Artin’s conjecture was known to be true only for representations which are positive
rational linear combinations of one-dimensional representations in which cases some
integral power of the corresponding Artin L-function is a product of Hecke L-functions. In
1976, in [19] Tate reported on the construction of a tetrahedral representation of conductor
N = 133 thereby inaugurating the experimental verification of Artin’s conjecture. They
were able to prove the existence of the corresponding newform of weight 1 and level 133
predicted by the Langlands conjecture (“by relatively easy hand computation”). By the
theorem of Deligne-Serre (theorem 4.3 above), this produced the first example of an Artin
L-series which is known to be holomorphic in spite of the fact that no power of it is a
product of abelian L-series.

The strategy used by Tate is the following: Given a 2-dimensional representation $\rho$ of the
appropriate kind, compute the coefficients $a_n$ of the Artin L-series $L(s, \rho)$ for $n \le A$, say.
Then look for a modular form of weight 1 of level 133 with Fourier coefficients $a_n$ for
$n \le A$. If A is sufficiently large, for instance $A > (N/12) \prod_{p \mid N} (1 + p^{-1})$, then this
form is uniquely determined, if it exists. Now invoke the theorem of Deligne-Serre to get
a representation $\rho_1$ corresponding to this form for which Artin’s conjecture is true.
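
The size of the bound on A is easy to evaluate. The following sketch computes $(N/12) \prod_{p \mid N} (1 + p^{-1})$ exactly; the function names are illustrative, and trial division is used only because the levels involved are small.

```python
from fractions import Fraction


def prime_divisors(n):
    """Distinct prime divisors of n, found by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def coefficient_bound(level):
    """The quantity (N/12) * prod over p | N of (1 + 1/p), as an exact fraction."""
    bound = Fraction(level, 12)
    for p in prime_divisors(level):
        bound *= Fraction(p + 1, p)
    return bound


print(coefficient_bound(133))  # 40/3, since 133 = 7 * 19
```

For Tate’s level N = 133 the bound is 40/3, so in this estimate any A of at least 14 suffices, which is consistent with the remark that the computation could be done by hand.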


4.4 Dimension of Spaces of New Forms of Weight 1 


Let $S_1^{\mathrm{new}}(N, \varepsilon)$ denote the complex vector space of newforms of weight 1, level N and
character $\varepsilon$, and let $d(N, \varepsilon)$ be the number of equivalence classes of 2-dimensional, complex,
irreducible, continuous, odd Galois representations with conductor N and determinant $\varepsilon$.
The theorem of Deligne-Serre implies $\dim S_1^{\mathrm{new}}(N, \varepsilon) \le d(N, \varepsilon)$. Artin’s conjecture is the
statement that

\[ \dim S_1^{\mathrm{new}}(N, \varepsilon) = d(N, \varepsilon). \]

Thus one way to verify Artin’s conjecture in a particular case is by computing the numbers
$\dim S_1^{\mathrm{new}}(N, \varepsilon)$ and $d(N, \varepsilon)$. But there is no known method of computing the dimension
of the space of forms of weight 1. Also, there is no general way of determining the number
of equivalence classes of 2-dimensional, complex, irreducible, continuous, odd Galois
representations with conductor N and determinant $\varepsilon$.

Suppose N = q, a prime with $q \equiv 3 \pmod 4$. Hecke proved that

\[ f_\chi(z) = \sum_{\mathfrak{a}} \chi(\mathfrak{a})\, e^{2\pi i N(\mathfrak{a}) z}, \]

where $\chi$ is an arbitrary non-trivial character on the ideal class group of $\mathbb{Q}(\sqrt{-q})$ and $\mathfrak{a}$
runs over all non-zero integral ideals in $\mathbb{Q}(\sqrt{-q})$, is a cusp form of weight 1, level q and
character $\left(\frac{\cdot}{q}\right)$. Since $q \equiv 3 \pmod 4$, the class number of $\mathbb{Q}(\sqrt{-q})$ is odd. It is easy to see
that $f_\chi = f_{\chi'}$ if and only if $\chi = \chi'$, or $\chi = \chi'^{-1}$. It follows that if the class number of
$\mathbb{Q}(\sqrt{-q})$ is h, then there are at least (h − 1)/2 linearly independent cusp forms of weight 1,
level q and character $\left(\frac{\cdot}{q}\right)$. Therefore Siegel’s theorem estimating the order of the class
group implies the ineffective lower bound

\[ \dim S_1\!\left(q, \left(\tfrac{\cdot}{q}\right)\right) \gg_{\varepsilon} q^{1/2 - \varepsilon} \]

for all $\varepsilon > 0$. But there are also forms which are not of the Hecke type. However, forms of
non-Hecke type are rare and it is conjectured that

\[ \dim S_1\!\left(q, \left(\tfrac{\cdot}{q}\right)\right) = \tfrac{1}{2}(h - 1) + O(q^{\varepsilon}). \]

In [9] W. Duke has proved that

\[ \dim S_1\!\left(q, \left(\tfrac{\cdot}{q}\right)\right) \ll q^{11/12} \log^4 q, \]

with an absolute implied constant.
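
The lower bound (h − 1)/2 coming from Hecke’s forms can be evaluated for small q. The sketch below computes the class number h of $\mathbb{Q}(\sqrt{-q})$ by counting reduced primitive binary quadratic forms of discriminant −q; this is a standard method chosen here for illustration, not one described in the text, and the function names are hypothetical.

```python
from math import gcd


def class_number(disc):
    """Number of reduced primitive forms a x^2 + b x y + c y^2 with b^2 - 4ac = disc < 0."""
    if disc >= 0 or disc % 4 not in (0, 1):
        raise ValueError("expected a negative discriminant congruent to 0 or 1 mod 4")
    count, a = 0, 1
    while 3 * a * a <= -disc:              # reduced forms satisfy 3 a^2 <= |disc|
        for b in range(-a + 1, a + 1):     # reduced forms satisfy -a < b <= a
            if (b * b - disc) % (4 * a) != 0:
                continue
            c = (b * b - disc) // (4 * a)
            if c < a or (c == a and b < 0):
                continue                   # not reduced
            if gcd(gcd(a, abs(b)), c) == 1:
                count += 1
        a += 1
    return count


def hecke_lower_bound(q):
    """Lower bound (h - 1)/2 for the number of Hecke-type forms, q prime, q = 3 mod 4."""
    if q % 4 != 3:
        raise ValueError("this sketch assumes q = 3 mod 4")
    return (class_number(-q) - 1) // 2


for q in (23, 47, 71):
    print(q, class_number(-q), hecke_lower_bound(q))
```

For q = 23, 47 and 71 the class numbers are 3, 5 and 7, giving at least 1, 2 and 3 linearly independent forms respectively.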


5 Icosahedral Representations


This section is devoted to the last unsolved case of Artin’s conjecture for 2-dimensional
representations, the case of icosahedral representations. Following the method of Tate,
J. Buhler in 1978 constructed for the first time a 2-dimensional icosahedral representation
satisfying Artin’s conjecture, [7] (see 5.1). About 15 years later, in 1993 Frey and his
students discovered seven more examples, [10]. All these examples are for icosahedral
representations over Q.


5.1 Verification of Artin’s Conjecture 


The aim of this section is to outline the method followed by J. Buhler [7] who gave the
first example of an icosahedral representation for which Artin’s conjecture is true. The
construction of these examples involves extensive computation on a computer.

The starting point is the construction of icosahedral representations with low conductors;
the lowest conductor found was $800 = 2^5 \cdot 5^2$. This was done by sieving through a
large number of quintic polynomials, eliminating first those whose discriminants were not
squares, then reducible polynomials and then eliminating those whose Galois groups were
proper subgroups of $A_5$. For each surviving quintic the ring of integers is computed to
determine the behavior of ramified primes and the minimal conductor. The quintic which
gives rise to the icosahedral representation of conductor 800 is:


\[ F(x) = x^5 + 10x^3 - 10x^2 + 35x - 18. \]
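
The first step of the sieve, discarding polynomials whose discriminant is not a square, can be sketched as follows. This is only an illustration and not Buhler’s program: the discriminant is computed from the Sylvester matrix of the polynomial and its derivative, and all function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def derivative(coeffs):
    """Coefficients (highest degree first) of the derivative."""
    n = len(coeffs) - 1
    return [c * (n - k) for k, c in enumerate(coeffs[:-1])]


def determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size, det = len(m), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def discriminant(coeffs):
    """Discriminant of an integer polynomial, coefficients given highest degree first."""
    n = len(coeffs) - 1
    deriv = derivative(coeffs)
    size = 2 * n - 1
    rows = []
    for shift in range(n - 1):   # n - 1 shifted copies of the polynomial
        rows.append([0] * shift + coeffs + [0] * (size - len(coeffs) - shift))
    for shift in range(n):       # n shifted copies of its derivative
        rows.append([0] * shift + deriv + [0] * (size - len(deriv) - shift))
    resultant = determinant(rows)
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    return int(sign * resultant / coeffs[0])


def is_square(value):
    """True if value is the square of an integer."""
    return value >= 0 and isqrt(value) ** 2 == value


print(discriminant([1, 0, -3, 1]), is_square(discriminant([1, 0, -3, 1])))
```

Calling `is_square(discriminant([1, 0, 10, -10, 35, -18]))` applies the same test to the quintic F(x); a square discriminant is necessary, but not sufficient, for the Galois group to be $A_5$.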


Denoting by $\rho$ the icosahedral representation of conductor 800 associated to F(x), the
next step is the calculation of $L(s, \rho) = \sum a_n n^{-s}$. This calculation involves computations
in a sextic extension of Q. As a sample we reproduce a few values of $a_n$; note that $a_n$
is 0 if n is not prime to 10. Let i denote a fixed fourth root of 1 and j the positive root of
$x^2 - x - 1 = 0$.


a => ef ai. <1 OG 4G 'O: Peay 


A cusp form of weight 1 is said to be dihedral if the corresponding representation (via
Deligne-Serre) is an irreducible properly induced representation, and tetrahedral
(resp. octahedral, icosahedral) if the representation is tetrahedral (resp. octahedral,
icosahedral).

Consider the space V of modular forms of type $(1, \varepsilon, 800)$ (i.e., of weight 1, character $\varepsilon$
and level dividing 800) for a carefully chosen Dirichlet character $\varepsilon$. The choice of $\varepsilon$ is such
that in V:


• there are two non-cuspidal eigenforms of level 100, denoted by $g_1$, $g_2$; and
• there is only one dihedral form of level 100, $g_3$ (note that $g_3$ is a cusp form).


Each of these forms can be ‘pushed-up’ to level 800 by the Atkin-Lehner operators $B_d$,
d = 1, 2, 4, 8 ($g|B_d(z) = g(dz)$). Let


\[ g_{i,d} = g_i | B_d. \]




If g is a modular form of type $(1, \varepsilon, N)$ let $\bar{g}$ denote the ‘complex conjugate’ of g; $\bar{g}$ is of
type $(1, \bar{\varepsilon}, N)$ and the Fourier coefficients of $\bar{g}$ are the complex conjugates of the Fourier
coefficients of g.

For z in the upper half plane, let $f(z) = \sum_{n \ge 1} a_n e^{2\pi i n z}$ where $a_n$ are as above
($L(s, \rho) = \sum a_n n^{-s}$). The aim is to show that this f coincides with the q-expansion of a cusp form
of weight 1 up to a required number of terms. This is done as follows.

By searching in the space of cusp forms of weight 2 and level 800 (which has dimension 
97) Buhler discovered the 


Fact: For each i = 1, 2, 3 and d = 1, 2, 4, 8 there is a modular form $h_{i,d}$ of weight 2 and
level 800 (and trivial character) such that

\[ f\, \bar{g}_{i,d} \equiv h_{i,d} \pmod{q^M}. \]

Here $\pmod{q^M}$ means that the first M terms of the two power series agree.


From this it follows that

\[ h_{i,d}\, \bar{g}_{j,d'} \equiv h_{j,d'}\, \bar{g}_{i,d} \pmod{q^M} \]

for all i, j, d, d'. On each side of this congruence is a modular form of weight 3, level 800
and a certain character. Such modular forms are sections of a bundle on $X_0(800)$ whose
degree is 344. Therefore the above congruence is actually an equality and we have that

\[ f' = \frac{h_{i,d}}{\bar{g}_{i,d}} \]

is independent of the choice of i, d. It turns out that f' is a cusp form of type $(1, \varepsilon, 800)$.
To prove this, one uses the fact that f' can be represented as a quotient in the many ways
above, and shows that the forms $\bar{g}_{i,d}$ have no common zero in the upper half-plane.

That f' is not of the dihedral type is proved by looking at the eigenvalue of f' for the
Hecke operator $T_3$; and that it is not of the tetrahedral or octahedral type follows from:


• There are no cyclic extensions of Q of degree 3 unramified outside 2 and 5 and hence
there are no $A_4$ extensions of Q unramified outside 2 and 5.

• There are exactly three $S_4$ extensions of Q unramified outside 2 and 5 and the corresponding
representations have conductors not dividing 800.


Therefore it follows that there is an icosahedral form in V! If $\rho_1$ denotes the corresponding
representation (via Deligne-Serre) then $\rho_1$ is an icosahedral representation which satisfies
Artin’s conjecture. It is not known if $\rho_1$ is the same as the representation $\rho$ that we started
with. Buhler ({7], pp. 90) proves that if f’ is an eigenform for 7); as well then it is so. 


5.2 Recent Developments 


Like the Deligne-Serre theorem (theorem 4.3), which constructs Galois
representations associated to modular forms of weight 1 such that the L-functions associated
to the Galois representation and to the modular form are the same, there is a more classical
theorem due to Deligne which constructs $\ell$-adic Galois representations for modular forms
of any weight $k \ge 2$. These Galois representations can be reduced modulo the maximal ideal
in an appropriate ring of $\ell$-adic integers to construct Galois representations with values in
GL(2) of a finite field. There has been much activity in recent years motivated by a conjecture
of Serre which says that any irreducible two dimensional Galois representation of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$
with values in a finite field which is odd, in the sense that the determinant
of complex conjugation is $-1$, comes via this construction from some modular form.

A representation of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ with values in GL(2, C) in fact lies in $GL(2, \mathcal{O})$ where
$\mathcal{O}$ is the ring of integers of a number field. By going modulo prime ideals in $\mathcal{O}$, an Artin
representation therefore gives rise to representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ with values in GL(2) of
a finite field. Kevin Buzzard and Richard Taylor have in [8] proved that if this representation
is modular in the sense that it comes from a modular form via Deligne’s construction, then
under a mild hypothesis, Artin’s conjecture is true for the original representation.

We note the following result, cf. theorem 1.2 in [18], which relates elliptic curves
to mod 5 Galois representations. From it the modularity of certain representations of
$\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ with values in $GL_2(\mathbb{Z}/5)$ follows by the work of Wiles, Taylor-Wiles, and
Diamond, which proves modularity of certain elliptic curves over Q.


Theorem 5.1 If $\rho : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to GL_2(\mathbb{Z}/5)$ is any representation whose determinant is
the cyclotomic character mod 5, then there exists an elliptic curve over Q which realises $\rho$
on its 5-torsion points.


However, for the purposes of Artin’s conjecture, modularity of a representation of
$\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ with values in $GL_2(\mathbb{Z}/5)$ is not adequate as it is easy to see that the reduction
modulo a prime ideal of an Artin representation of $A_5$-type taking values in $GL_2(\overline{\mathbb{Q}})$
cannot lie in $GL_2(\mathbb{Z}/5)$ (with determinant the cyclotomic character).

In conclusion, one can say that there is now a general approach to Artin’s conjecture for
2-dimensional representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ which, in the absence of knowing enough mod p
representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ to be modular, has not yet been successful in completing the proof of
Artin’s conjecture for 2-dimensional representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$.

In a recent work, C. Khare has proved that Serre’s conjecture implies Artin’s conjecture
for 2-dimensional odd representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$, cf. [13].


6 Stark’s Conjecture 


In this section we state an important conjecture of Stark which does not depend on Artin’s
conjecture but uses Artin L-functions. There is a lot of importance attached to Stark’s
conjecture especially as it is a contribution to Hilbert’s 12th problem about generation of
class fields of number fields in terms of explicit values of transcendental functions just as was
done for Q and imaginary quadratic fields using values of exponential and elliptic
functions respectively.

If L is a number field with $r_1$ real embeddings and $r_2$ pairs of complex embeddings, then
it is known that the zeta function of L has a zero of order $r_1 + r_2 - 1$ at the origin, and the
Taylor expansion of $\zeta_L(s)$ at s = 0 begins as


\[ \zeta_L(s) = -\frac{hR}{e}\, s^{r_1 + r_2 - 1} + \cdots, \]


where h is the class number of L, R the regulator, e the number of roots of unity in L.

Stark has defined an analogue of the regulator of a number field for all Artin
representations, now called the Stark regulator, and has conjectured that, just as for the Dedekind
zeta function, the leading term of the Taylor expansion of $L(s, \chi)$ at s = 0 divided by the
Stark regulator is an algebraic number belonging to the cyclotomic field in which the values
of the character $\chi$ lie. We refer to the book of Tate [20] for an exposition of Stark’s
conjecture which remains open. We note that from proposition 3.4 of [20], the order of
vanishing of $L(s, \chi)$ at s = 0 is $r(\chi)$ which is


\[ r(\chi) = \sum_{v \mid \infty} \dim V^{G_w} - \dim V^{G}, \]


where V is the vector space on which the representation $\chi$ is defined, and $G_w$ denotes the
decomposition group at a place w of L over an infinite place v of K. It follows from this
formula for $r(\chi)$ that if L is abelian over K and $\chi$ is a faithful character, then $r(\chi) = 1$
if and only if the decomposition group is trivial at exactly one infinite place of K. In this
case, the Stark regulator is the determinant of a 1 × 1 matrix, and unwinding the definitions,
it follows from Stark’s conjecture that if L is an abelian extension of a number field K, and
$\chi$ is a character of the Galois group of L over K, then


\[ L'(0, \chi) = -\frac{1}{e} \sum_{\sigma} \chi(\sigma) \log |\epsilon^{\sigma}|, \]


for a suitable element $\epsilon$ in L, where e denotes the number of roots of unity in L; the absolute
value is taken with respect to a fixed place of L which lies over the unique Archimedean
place of K where the decomposition group is trivial. The element $\epsilon$ of L is called the Stark
unit and has the property that the extension of L obtained by attaching the e-th root of $\epsilon$ is
abelian over K. If $\chi$ is a faithful character of Gal(L/K), $\epsilon$ generates L over K. This case
of Stark’s conjecture dealing with abelian extensions is also open except when K = Q,
or an imaginary quadratic field: the case K = Q is well-known, and the imaginary
quadratic case is due to Stark, where the Stark unit is an elliptic unit.
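
The formula for the order of vanishing $r(\chi)$ can be illustrated in the simplest case. The following sketch assumes K = Q and L a quadratic field, with $\chi$ the non-trivial character of G = Gal(L/Q), so that V is one-dimensional and $V^G = 0$.

```latex
% L real quadratic: the decomposition group G_w at the real place of Q is trivial.
r(\chi) = \dim V^{\{1\}} - \dim V^{G} = 1 - 0 = 1 .
% L imaginary quadratic: G_w = G is generated by complex conjugation.
r(\chi) = \dim V^{G} - \dim V^{G} = 0 - 0 = 0 .
% The trivial character \chi_0 has r(\chi_0) = 1 - 1 = 0 in both cases. Since
% \zeta_L(s) = \zeta(s) L(s, \chi), this matches the order r_1 + r_2 - 1 of the zero
% of \zeta_L(s) at s = 0: 2 + 0 - 1 = 1 (real case) and 0 + 1 - 1 = 0 (imaginary case).
```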


Siegel’s Main Theorem for Quadratic Forms


S. Raghavan 


§ 1 


A classical question in the Theory of Numbers is one of expressing a positive integer as 
a sum of squares of integers. The qualitative aspects of this problem require at times no 
more than rudimentary congruence considerations e.g. a prime number leaving remainder 
3 on division by 4 cannot be a sum of two squares of integers; however, in general, subtle 
arguments are called for. Fermat’s Principle of Descent needs to come into play for a 
proof of the (Euler-Fermat-) Lagrange theorem that every positive integer is a sum of four 
squares of integers. Skillful use of elliptic theta functions was made by Jacobi to obtain 
a quantitative refinement of that assertion, viz. according as n is an odd or even natural 
number, the number of ways of expressing n as a sum of four squares of integers is $8\sigma^*(n)$
or $24\sigma^*(n)$, where $\sigma^*(t)$ for any natural number t is the sum of all the odd natural numbers
dividing t; Jacobi’s famous identity linking $\theta_3$ with other theta constants $\theta_2$, $\theta_4$ and their
derivatives is an analytic encapsulation of all these formulae for varying n. An analytic 
formulation of similar nature arises also as a special case of the Siegel formula (extended 
suitably to cover the boundary case of quaternary quadratic forms as well) which connects 
theta series associated with quadratic forms to Eisenstein series: for complex z with positive 
imaginary part, 


\[ \Big( \sum_{n \in \mathbb{Z}} \exp(\pi i n^2 z) \Big)^{4} = 1 + \sum_{q \in \mathbb{N}} (-1)^q \sum_{\substack{p \in \mathbb{Z} \\ (p,q) = 1,\; p + q \equiv 1 \,(\mathrm{mod}\ 2)}} (p - qz)^{-2}, \]


where the sum over p, q giving the Eisenstein series on the right hand side is only conditionally
convergent and can be realized from an absolutely convergent Eisenstein series by
analytic continuation via Hecke’s Grenzprozess. (The inner sum over p is over all integers
which are coprime to q and are of opposite parity to q.)
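
Jacobi’s count of representations as a sum of four squares can be confirmed by brute force for small n. The following is a minimal sketch; the function names are illustrative and the search is deliberately naive.

```python
from math import isqrt


def count_four_squares(n):
    """Number of integer solutions (a, b, c, d) of a^2 + b^2 + c^2 + d^2 = n."""
    bound = isqrt(n)
    values = range(-bound, bound + 1)
    count = 0
    for a in values:
        for b in values:
            for c in values:
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    continue
                root = isqrt(rest)
                if root * root == rest:
                    # d = 0 gives one solution, d = +root or -root gives two.
                    count += 1 if root == 0 else 2
    return count


def odd_divisor_sum(n):
    """Sum of the odd natural numbers dividing n."""
    return sum(d for d in range(1, n + 1, 2) if n % d == 0)


def jacobi_count(n):
    """Jacobi's formula: 8 or 24 times the odd divisor sum, for n odd or even."""
    return (8 if n % 2 else 24) * odd_divisor_sum(n)


for n in range(1, 11):
    print(n, count_four_squares(n), jacobi_count(n))
```

The two columns agree; for example n = 1, 2, 3, 4 give 8, 24, 32, 24 representations.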


§ 2 
More generally, let us consider a positive definite quadratic form $f(x_1, \ldots, x_m)$
($= \sum_{1 \le i,j \le m} s_{ij} x_i x_j$) in m variables $x_1, \ldots, x_m$ with the associated symmetric positive definite
integral (coefficient) matrix $S := (s_{ij})$ and then the number $r(f; t) = r(S; t)$ of ways of
representing an integer t as $f(a_1, \ldots, a_m)$ with integers $a_1, \ldots, a_m$ (instead of representing
t merely as the sum $a_1^2 + \cdots + a_m^2$ of m squares of integers $a_1, \ldots, a_m$). We know from
Minkowski’s reduction theory that there are only finitely many integral positive definite
quadratic forms, say $f_1 = f, f_2, \ldots, f_h$ such that any integral positive definite form g
which is equivalent to f over the ring $\mathbb{Z}_p$ of p-adic integers for every prime p is equivalent
to one of $f_1, \ldots, f_h$ over Z while no two distinct ones among $f_1, \ldots, f_h$ are equivalent
over Z. In other words, the genus of the positive definite integral quadratic form f splits
into h different classes each containing precisely one of $f_1, \ldots, f_h$. (We recall here that
for two quadratic forms $g_1$, $g_2$ with coefficients in a commutative ring R with unit element,
$g_1$ is said to represent $g_2$ over R if there exists a linear transformation of the variables with
coefficients from R taking $g_1$ to $g_2$ and moreover $g_1$ is called equivalent to $g_2$ over R (or
said to be in the same R-equivalence class as $g_2$) if $g_1$ and $g_2$ represent each other mutually
over R. The number $r(f; t)$ above is thus the number of representations of the quadratic
form $t y^2$ by $f(x_1, \ldots, x_m)$ over Z.) Analogously, let for a power $p^a$ of any given prime
number p, $r(f, t; p^a)$ denote the number of representations of t by f over the quotient
ring $\mathbb{Z}/p^a\mathbb{Z}$. Then $d_p(f, t)$, the p-adic density of representations of t by f, is defined as
$c_m \lim_{a \to \infty} p^{-a(m-1)}\, r(f, t; p^a)$ with $c_m := 1/2$ or 1 according as m = 1 or m > 1; this
p-adic density $d_p(f, t)$ is clearly non-negative and indeed it is a rational number which
equals 0 precisely when f fails to represent t over the p-adic ring $\mathbb{Z}_p$. The infinite product
$\prod_p d_p(f, t)$ extended over all the prime numbers p converges and is equal to 0 exactly
when f fails to represent t over $\mathbb{Z}_p$ for at least one prime i.e. when at least one $d_p(f, t)$ is
0. The real density $d_\infty(f, t)$ for measuring the representability of t by f over the field R of
real numbers is defined as $\lim_U \mathrm{vol}(f^{-1}(U))/\mathrm{vol}\, U$, where the limit is taken over measurable
neighbourhoods U of t shrinking to {t} with vol(·) denoting Lebesgue measure in
the respective spaces; it is known [17] that $d_\infty(f, t) = \pi^{m/2}\, t^{(m-2)/2} / \{\Gamma(m/2)\, (\det f)^{1/2}\}$,
where $\Gamma$ is Euler’s gamma function and det f is just the determinant of f. Finally let, for
$1 \le i \le h$, $e_i$ denote the number of linear transformations over Z which preserve the form $f_i$.
We can now state Siegel’s main theorem for positive definite integral quadratic forms f (in
this special case, for $t \in \mathbb{N}$):


$$\frac{\sum_{1\le i\le h} r(f_i;t)/e_i}{\sum_{1\le j\le h} 1/e_j} \;=\; 2^{-\delta_{m,2}}\; d_\infty(f,t)\prod_{p} d_p(f,t) \qquad (*)$$

with $\delta_{m,2}$ denoting the Kronecker delta function.
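
The p-adic densities entering the main theorem can be explored numerically in the simplest case. The following is only a rough sketch, assuming f is the sum of four squares (so m = 4 and the normalising constant is 1); the function names are hypothetical and the counting is plain brute force over residues, not an efficient method. It evaluates the normalised counts $p^{-a(m-1)} r(f,t;p^a)$ for small p and a, which appear to stabilise already for small a.

```python
from fractions import Fraction
from itertools import product


def count_representations(t, modulus, m=4):
    """Count x in (Z/modulus)^m with x_1^2 + ... + x_m^2 = t (mod modulus)."""
    squares = [(x * x) % modulus for x in range(modulus)]
    target = t % modulus
    return sum(1 for xs in product(squares, repeat=m) if sum(xs) % modulus == target)


def local_density(t, p, a, m=4):
    """Normalised count p^(-a(m-1)) * r(f, t; p^a) for f = sum of m squares (m > 1)."""
    return Fraction(count_representations(t, p ** a, m), p ** (a * (m - 1)))


if __name__ == "__main__":
    # Brute force: the cost grows like p^(a*m), so keep p and a small.
    for p, a in [(2, 1), (2, 2), (3, 1), (3, 2), (5, 1)]:
        print(f"p={p}, a={a}: {local_density(1, p, a)}")
```

For t = 1 the values found for odd p agree with $1 - p^{-2}$, and the value at p = 2 is 1. Multiplying $\prod_{p\ \mathrm{odd}}(1-p^{-2}) = 8/\pi^2$ by the real density $\pi^2$ gives 8, the number of ways of writing 1 as a sum of four squares, which is what (*) predicts when the genus consists of a single class.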

For f to represent t over Z, it is clearly necessary that f represents t over R and over
the p-adic ring $\mathbb{Z}_p$ for every prime p. But even all the latter infinitely many conditions
put together cannot ensure that f represents t over Z although, by the Hasse principle, the
representation of t by f over Q is then assured. Siegel's main theorem guarantees in such
an event that at least one of $f_1, \ldots, f_h$ represents t over Z (even if f does not do so) and is
thus a far more subtle refinement and a quantitative one too. The remarkable string of papers
([17], [18], [19]) by Siegel deals with the general situation of quadratic forms f representing
quadratic forms g (instead of numbers t), again not necessarily over the ring Z but over the
ring of algebraic integers in a totally real algebraic number field and furthermore with f
not having to be a definite quadratic form.
§ 3


For m > 4, Siegel reformulated the main theorem stated in (*) as an analytic identity
between theta series associated with $f_1, \ldots, f_h$ and Eisenstein series:

$$\frac{\sum_{1\le j\le h} \theta(f_j,z)/e_j}{\sum_{1\le j\le h} 1/e_j} \;=\; 1 + {\sum_{\substack{(a,b)=1\\ b>0}}}' H(f,b,a)\,(bz-a)^{-m/2} \qquad (**)$$

where, for complex z with positive imaginary part, the theta series are

$$\theta(f_j,z) = \sum_{a_1,\ldots,a_m\in\mathbb{Z}} \exp(\pi i z\, f_j(a_1,\ldots,a_m)),$$

the quantities

$$H(f,b,a) = (\sqrt{-1}/b)^{m/2}(\det S)^{-1/2} \sum_{a_1,\ldots,a_m \bmod b} \exp(\pi i a\, f(a_1,\ldots,a_m)/b)$$

are generalized Gaussian sums and the summation over a, b in the Eisenstein series is
carried out over all coprime integer pairs a, b with b > 0 and the accent requires ab to
be even whenever f represents some odd integer over Z. The Eisenstein series on the
right hand side of the analytic identity (**) converges absolutely for m > 4. For m = 4,
one invokes Hecke's limiting process to suitably define the Eisenstein series via analytic
continuation. The significance of the identity (**) can be fully realized only when we notice
that it is perhaps analytically impossible to distinguish between the various theta series
$\theta(f_j,z)$ individually in view of their exhibiting the same kind of behaviour as the variable
z approaches the 'rational points' a/b or infinity (so that for i ≠ j, $\theta(f_i,z) - \theta(f_j,z)$ tends
to 0 as z tends to a/b or ∞); thus the individual theta series seem to resist being expressible
in terms of explicit objects like Eisenstein series. It is on the other hand remarkable that
the arithmetical main theorem correctly points to the right weighted mean of the theta series
$\theta(f_j,z)$ being expressible as an Eisenstein series!


§4 


For considering the problem of representing forms g in n variables by the given form
f (instead of merely representing numbers by f) over Z, the complex variable z above is to be
replaced by a point Z of the Siegel upper half plane
$H_n := \{Z = {}^tZ \in M_n(\mathbb{C}) \mid \tfrac{1}{2i}(Z - \bar{Z})$ is positive definite$\}$
and the associated theta series is $\theta(S,Z) := \sum_G \exp(\pi i\,\mathrm{tr}({}^tG S G Z))$,
where now G runs over all (m, n) integral matrices and tr denotes matrix trace. The
general term in the relevant Eisenstein series that features in an analogue of (**) takes
the form $H(S,C,D)(\det(CZ+D))^{-m/2}$, where H(S, C, D) are generalized Gauss sums
and the summation in the Eisenstein series is over all n-rowed coprime symmetric pairs
(C, D), i.e. with (n, n) integral matrices C, D satisfying the conditions that (i) $C\,{}^tD = D\,{}^tC$
and (ii) GC, GD integral for any (n, n) rational matrix G imply that G is integral. Such
n-rowed coprime symmetric pairs C, D make up precisely the last n rows of elements
$M = \begin{pmatrix} A & B\\ C & D\end{pmatrix}$ of the Siegel modular group
$\Gamma_n := \{M = \begin{pmatrix} A & B\\ C & D\end{pmatrix} \in M_{2n}(\mathbb{Z}) \mid A, B, C, D \in M_n(\mathbb{Z}),\ A\,{}^tB = B\,{}^tA,\ C\,{}^tD = D\,{}^tC$
and $A\,{}^tD - B\,{}^tC = E_n$, the n-rowed identity matrix in $M_n(\mathbb{Q})\}$.
The Siegel modular group $\Gamma_n$ acts on $H_n$ via the modular transformations
$Z \mapsto (AZ+B)(CZ+D)^{-1}$ for $M = \begin{pmatrix} A & B\\ C & D\end{pmatrix} \in \Gamma_n$.
Under modular transformations, the behaviour of the Eisenstein series on the right hand side of
an analogue of (**) is quite similar to that of the h theta series $\theta(S_i,Z)$. For n = 1, as
remarked earlier, $\theta(S_i,z) - \theta(S_j,z)$ is a so-called cusp form for an appropriate subgroup
of the elliptic modular group $\Gamma_1$; this is no longer true, in general, for n > 1. For n = 1 this
phenomenon however leads immediately to an asymptotic formula for r(S; t) with the main term in
the asymptotic formula being given by the Fourier coefficient corresponding to the index
t in the Fourier expansion of the Eisenstein series on the right hand side of the identity
(**) above.


§ 5 


In the proof of Siegel's main theorem (*) and more general formulations covering the
representation of forms by f or the case of f being indefinite as well or when Z is replaced
by rings of algebraic integers, a crucial step eventually is to show that a certain number
ρ(S) equals 2 where, for S positive,

$$\rho(S) = (\det S)^{-(m+1)/2}\Bigl(\sum_{1\le j\le h}\frac{1}{e_j}\Bigr)\prod_{1\le j\le m}\frac{\pi^{j/2}}{\Gamma(j/2)}\ \lim_{q\to\infty} 2^{-\omega(q)}\, q^{-m(m-1)/2}\, e_q(S)$$

where the limit is taken over natural numbers q tending to infinity through the sequence
of factorials n!, ω(q) is the number of prime factors of q and $e_q(S)$ is the number of
linear transformations preserving the quadratic form f associated with S over the quotient
ring Z/qZ. A group-theoretic interpretation of this definition of ρ(S), by bringing in the
orthogonal group G of f (defined over Q) and choosing the rings Z, $\mathbb{Z}_p$ for all primes p
and R for the base domain, makes the above crucial step equivalent to proving that ρ(S)
is just the (Weil-)Tamagawa number τ(G) for the orthogonal group G corresponding to
f; it was actually shown by Tamagawa that τ(G) = 2. By considering the 'adelic zeta
function' attached to orthogonal groups of quadratic forms f and its residues at poles (aptly
generalizing Siegel's zeta functions associated with f when indefinite too), Weil proved
in [25], for orthogonal groups G of (non-degenerate) quadratic forms in m > 3 variables, by
induction on m, the “Siegel-Tamagawa theorem that τ(G) = 2” which “by purely formal
calculations” might be seen to be equivalent to “Siegel's main theorem” (on representation
by quadratic forms); Weil's generalization of the Siegel formula for classical groups [27]
yields again that τ(G) is the same for all m > 4.
§ 6


A nice and surprising proof for Siegel's main theorem for representation of integral quadratic
forms (or equivalently (n, n) symmetric matrices T) by unimodular positive definite even
quadratic forms $f(x_1,\ldots,x_m)$ for m > 2n + 2 has been given by Andrianov ([1], [4, Ch. IV,
§6]), by explicit determination of the effect of Hecke operators on theta series attached to
such f and invoking properties of Eisenstein series. If S is the integral m-rowed symmetric
(m, m) matrix associated with any such given form f, then S has even positive integers
as diagonal entries and is of determinant 1; it is well-known that m is then necessarily
a multiple of 8. Let $S_1 (= S), S_2, \ldots, S_h$ be the matrices corresponding to the
representatives $f_1, f_2, \ldots, f_h$ of the h different (Z-equivalence) classes in the genus of
f. Let $r(S_i,T)$ be the number of integral (m, n) matrices G such that ${}^tG S_i G = T$.
Then with notation as in §3, we have for $Z \in H_n$ and any $S_i$ (1 ≤ i ≤ h),
$\theta(S_i,Z) = \sum_T r(S_i,T)\exp(\pi i\,\mathrm{tr}(TZ))$, where the summation in the Fourier
expansion on the right hand side is over all (n, n) symmetric non-negative definite even integral
matrices T and tr denotes matrix trace as before. For any
$M = \begin{pmatrix} A & B\\ C & D\end{pmatrix} \in \Gamma_n$, we have that
$\theta(S_i;Z)|M := \theta(S_i,(AZ+B)(CZ+D)^{-1})\det(CZ+D)^{-m/2}$ equals $\theta(S_i,Z)$ and thus
$\theta(S_i,Z)$ is a Siegel modular form of degree n and weight m/2. For any given prime number p,
the double coset $\Gamma_n\begin{pmatrix} E_n & 0\\ 0 & pE_n\end{pmatrix}\Gamma_n$ of (2n, 2n) matrices
is a finite union of left cosets $\Gamma_n N_j$ with
$N_j = \begin{pmatrix} A_j & B_j\\ C_j & D_j\end{pmatrix}$. The Hecke operator T(p) acting on Siegel
modular forms f(Z) of degree n and weight m/2 is given by

$$f(Z)|T(p) = p^{mn/2 - n(n+1)/2}\sum_j f\bigl((A_jZ+B_j)(C_jZ+D_j)^{-1}\bigr)\,(\det(C_jZ+D_j))^{-m/2}.$$
J 


Andrianov determined explicitly the effect of the Hecke operator T(p) on $\theta(S_i;Z)$ and
showed that the genus invariant
$F(S,Z) := \sum_{1\le i\le h} e_i^{-1}\theta(S_i;Z) \big/ \sum_{1\le i\le h} 1/e_i$, which is
the weighted mean of the $\theta(S_i;Z)$ generalizing the left hand side of the identity (**) in §3, is
actually an eigenform of the Hecke operator T(p) for every prime p; the constant term in
the Fourier expansion of F(S, Z) is clearly equal to 1. On the other hand, it is known that
any such Siegel modular form (i.e. of degree n and weight m/2) which has constant term 1
and is further an eigenform of an infinite number of Hecke operators T(p) has to coincide,
for m/2 > n + 1, with the Eisenstein series $E_{m/2}(Z) = \sum_{C,D}\det(CZ+D)^{-m/2}$,
the summation being over a complete set of n-rowed coprime symmetric pairs (C, D)
such that no two distinct ones among them, say $(C_1,D_1)$, $(C_2,D_2)$, are related by the
condition $C_1 = UC_2$, $D_1 = UD_2$ with some integral (n, n) matrix U of determinant ±1.
The condition m/2 > n + 1 ensures the absolute convergence of this series. Its Fourier
coefficients are well-known from Siegel's fundamental paper [20]; comparison of Fourier
coefficients on both sides of the relation $F(S,Z) = E_{m/2}(Z)$ yields Siegel's main theorem
on representation of (n, n) positive definite integral matrices T by an even (m, m) positive
definite integral S of determinant 1 under the condition m > 2n + 2. These restrictions
on S may seem to be rather severe since the (arithmetical) main theorem of Siegel is true
without such 'stringent' conditions; still the function theoretic proof above for Siegel's main
theorem even for a special case of positive definite S does represent a spin-off for Arithmetic,
especially when, as remarked earlier, it seems impossible to distinguish between the various
$\theta(S_i,Z)$ analytically. Let us also recall that, in the other direction, Siegel's main theorem for
quadratic forms and the resultant analytic identity (linking theta series with totally disparate
objects such as Eisenstein series) was the starting point for Siegel's full-fledged analytic
theory of modular functions of degree n. Siegel modular functions of degree n have the
same significance for algebraic function fields of genus n as the usual (elliptic) modular
functions for function fields of genus 1; they represent the contribution from Arithmetic
towards the study of an important problem of the theory of algebraic functions. Of course,
for the development of the theory of automorphic forms on semisimple Lie groups, the
theory of Siegel modular forms has been the driving force and the touchstone as well.


§ 7 


A Siegel modular form of degree n and weight k is a holomorphic (complex-valued) function
f on $H_n$ such that for every $M = \begin{pmatrix} A & B\\ C & D\end{pmatrix} \in \Gamma_n$,
$f((AZ+B)(CZ+D)^{-1})\det(CZ+D)^{-k} = f(Z)$ and further f is regular at infinity for n = 1.
Given a Siegel modular form f of degree n and weight k, the Siegel operator φ associates a
Siegel modular form φf of degree n − 1 and weight k by the prescription
$(\phi f)(Z_1) := \lim_{\lambda\to+\infty} f\begin{pmatrix} Z_1 & 0\\ 0 & i\lambda\end{pmatrix}$ for
$Z_1 \in H_{n-1}$. Cusp forms of degree n are those belonging to the kernel of φ. If k is large
enough in relation to n, the Siegel operator is surjective as shown by Maass [12] with the
use of his Poincaré series and by Klingen [10] through his Eisenstein series 'lifting' cusp
forms g of degree j < n to Siegel modular forms of degree n. The latter Eisenstein series
of degree n are series G(Z, g) of the form
$\sum_M g(\pi((AZ+B)(CZ+D)^{-1}))\det(CZ+D)^{-k}$
with g being any given cusp form of degree j < n, where π(W) represents the principal (j, j)
minor for $W \in H_n$, and the summation over $M = \begin{pmatrix} A & B\\ C & D\end{pmatrix}$ runs
through representatives of $\Gamma_n$ modulo an appropriate subgroup. This series converges
absolutely under the condition that k > n + j + 1 and is mapped to the cusp form g of degree j
under the iterated operator $\phi^{n-j}$. For k < 2n, one needs to insert in the general summand a
Hecke convergence factor
$(\det \mathrm{Im}(M\langle Z\rangle)/\det \mathrm{Im}(\pi(M\langle Z\rangle)))^{s}$ with
$M\langle Z\rangle := (AZ+B)(CZ+D)^{-1}$, $\mathrm{Im}(W) := \tfrac{1}{2i}(W - \bar{W})$ and
a complex parameter s, while real part of s being large clearly ensures absolute convergence
of the series; thus the analytic continuation of the Eisenstein series with convergence factors
as a function of s has to be studied. For n = 1, the first such (vector-valued) Dirichlet series
in s arising from Eisenstein series (with Hecke convergence factors) associated to Jacobi
theta constants $\theta_1, \theta_2, \theta_3$ or theta series attached to even quadratic forms (of given
signature) were thoroughly investigated by Siegel [22, 23] as regards their continuation as a
function of s and their functional equation involving s. For general n, the analytic continuation
and functional equation for similar Eisenstein series with suitable Hecke convergence factors
are subsumed in the general framework occurring in Langlands' theory of Eisenstein series
on semisimple Lie groups [11].
It is not clear, on the face of it, if, even for a simple looking Eisenstein series $E_k^{(n)}(Z,s)$
with Hecke convergence factors defined by
$E_k^{(n)}(Z,s) := (\det \mathrm{Im}(Z))^{s}\sum_{C,D}\det(CZ+D)^{-k}\,|\det(CZ+D)|^{-2s}$,
convergent absolutely for k > n + 1 and complex s
with Re(s) > 0, we have, in the 'boundary case' of k = n + 1 ∈ 2N, a holomorphic function
of Z as limit when s tends to 0, say, through positive real values. For n = 1 and k = 2,
we know from Hecke [7] that the limit as s tends to 0 exists but is not holomorphic as a function of
Z; the first example for higher degrees n of the Hecke limiting process yielding a non-zero
holomorphic modular form of degree n is for the case n = 3 and k = 4 and may be found
in [13]. For general n > 1 with the 'boundary case' k = n + 1 ∈ 2N, we know from the
comprehensive researches of Weissauer [28] and results obtained independently by Shimura
[16] when precisely the limiting process yields holomorphic modular forms; Weissauer
shows in particular that Hecke's Grenzprozess yields non-zero holomorphic modular forms
for (even) k > (n + 3)/2 or if 4 | k with k < (n + 1)/2. Weissauer [28] also determines the
obstruction to φ-lifting, i.e. lifting a cusp form g of degree j and weight k to a holomorphic
limit of series resembling Eisenstein series G(Z, g) with appropriate Hecke convergence
factors in the case of 'boundary weights' k. The study of Eisenstein series and results on
φ-lifting are moreover applied in [28] to obtain, in particular, theorems on representation of
Siegel modular forms as linear combinations of theta series θ(S, Z) as above, generalizing
Böcherer's notable results [2]. Actually Böcherer proved that every Siegel modular form
(respectively cusp form) of degree n and weight k with 4 | k and k > 2n is a linear combination
of theta series (respectively with spherical harmonics) associated to even unimodular
positive definite quadratic forms, by first determining the Fourier expansion of Klingen's
Eisenstein series G(Z, g) and then applying important results of Garrett [6], difficult work
of Andrianov [1] on Euler products and Siegel's main theorem for quadratic forms. The
'basis problem' for elliptic modular forms (not necessarily of level 1) is one of expressing
them as linear combinations of theta series associated with quadratic forms and it is
Waldspurger's noteworthy paper [24] that made use of Siegel's main theorem to tackle the
'basis problem' for the first time. “At the other end of the spectrum”, when the weight
k of Siegel modular forms of degree n is much smaller in relation to n (say, k does not
satisfy the condition k > 2n but actually k < n/2), we have precisely the singular Siegel
modular forms which can also be characterized by having all their Fourier coefficients
indexed by (n, n) positive symmetric matrices T (in the Fourier expansion) equal to 0; we
know ([4, Ch. IV, §5], [14]) that every singular Siegel modular form of degree n (for $\Gamma_n$)
is a linear combination of theta series associated with even unimodular positive definite
quadratic forms.


§ 8 


In two powerful papers ([26], [27]), Weil used the fascinating setting of adelic analysis to
present the analytic formulation of Siegel's main theorem for quadratic forms (see also [21])
as an identity between two 'invariant tempered distributions', the 'theta distribution' and
the 'Eisenstein distribution', defined in the context of not just orthogonal groups but more
generally classical groups arising from algebras with involution. A vital step in his proof
is a very general Poisson formula involving transforms $F_\Phi$ of Schwartz-Bruhat functions
Φ on locally compact abelian groups G and the Fourier transform $F_\Phi^*$ of $F_\Phi$ and the
recognition of the measures 'making up' $F_\Phi^*$. Suitable specialization of Φ and G leads one
to recover Siegel's analytic formulation of his main theorem. Using tools from analysis
and deep algebraic geometry such as the Hironaka resolution of singularities in the case of
hypersurfaces arising from forms f in n variables and of degree m > 2, Igusa [9] obtained
under appropriate restrictions (e.g. n > 2m > 4 and the variety defined by f = 0 is
irreducible and normal) a Poisson formula generalizing that of Weil to the case of forms
of higher degree. A local-global theorem of Birch [2] generalizing, to forms f of higher
degree, Davenport's results on cubic forms and concerning the existence of rational points
on hypersurfaces defined by f, results as an application of Igusa's Poisson formula (see [16]
for a nice survey on this).


§ 9 


Siegel's main theorem for positive definite quadratic forms f as stated in (*) of §2 is valid
without restriction on m while the analytic formulation in (**) of §3, valid for m > 4, can
be extended to cover the boundary case m = 4 as well by an appropriate method, viz.
Hecke's limiting process, to overcome the problem of the Eisenstein series not converging
absolutely for m = 4. On the other hand, when f is indefinite, the numbers r(f, t)
are no longer finite in general and have to be replaced by 'measures of representation'
which are essentially volumes of relevant fundamental domains for discrete subgroups
of the orthogonal group of f. Siegel [21] proved an analogue of (*) for indefinite f
with m > 4 and similarly of (**) except when f is a quaternary integral quadratic form
with square determinant and represents 0 over Z non-trivially; the left hand side of this
analogue of (**) is given as an integral of a theta series over a fundamental domain for the
group of (Z-)automorphisms of f while the right hand side is still an Eisenstein series.
Siegel's generalization of all these results of his, including those covering representations
over algebraic number fields, is to be found in [21] while Weil's Siegel formula with its
several distinctive features surprisingly unifies the discussion of the definite and indefinite
quadratic forms (also with rings of algebraic integers in place of Z) under conditions (e.g.
m > 4) ensuring absolute convergence of the Eisenstein series. For indefinite ternary or
binary quadratic forms, the analogue of Siegel's main theorem fails to be valid [21]; distinct
classes in a given genus of binary indefinite forms represent integrally essentially distinct
sets of prime numbers.


§ 10 


In [15], the Siegel-Weil formula in a range of cases not covered by [27] is established for
representation of quadratic forms in n variables by 'anisotropic' (i.e. not representing 0
non-trivially over Q) quadratic forms in m variables over Q with even integral m > n or
with all m > 1 = n. The problem here is far more than just taking care of non-absolute
convergence of Eisenstein series (via an analogue of Hecke's procedure)! The proof makes
use of Hecke intertwining maps and singular automorphic forms in the sense of Howe [8].
Pfister's Work on Sums of Squares


A.R. Rajwade


Historically the theory of quadratic forms was regarded as a topic in number theory.
However, Witt's paper “Theorie der quadratischen Formen in beliebigen Körpern” of
1937 [15] opened up a new chapter in the theory: that of combining the number theoretic
aspect with the algebraic development, by the creation of the famous Witt ring.

Then, triggered off by Cassels' paper “On the representation of rational functions as sums
of squares” of 1964 [4], Albrecht Pfister, about 1966, came up with his celebrated structure
theorems, giving birth to a purely algebraic theory of quadratic forms. Special cases of
the arithmetical aspect of Pfister's theory are his beautiful results about sums of squares and
Pfister forms.

Our object here is to give a brief exposition of Pfister's work on sums of squares and
related topics, one of the most beautiful and self-contained sets of results in any field (pun
intended).

So let K be a field. We make the following 


Definition 1. The smallest integer s = s(K) for which the equation $-1 = a_1^2 + a_2^2 + \cdots + a_s^2$
($a_i \in K$) is solvable, is called the Stufe (often referred to as level) of K. If the equation
has no solution, we put s = ∞ and call K formally real.


In 1932, Van der Waerden had posed the problem of enquiring which numbers can occur
as Stufe. For example 3 can never occur as Stufe. Indeed if $-1 = x^2 + y^2 + z^2$ (x, y, z ∈ K),
then $0 = 1 + x^2 + y^2 + z^2$. Multiplying by $1 + x^2$ gives

$$0 = (1+x^2)^2 + (1+x^2)(y^2+z^2) \qquad (a)$$
$$\phantom{0} = (1+x^2)^2 + (y+xz)^2 + (z-xy)^2 \qquad (b)$$

Now $1 + x^2 \ne 0$, otherwise Stufe K = 1. Hence (b) gives

$$-1 = \Bigl(\frac{y+xz}{1+x^2}\Bigr)^2 + \Bigl(\frac{z-xy}{1+x^2}\Bigr)^2,$$

showing s(K) ≠ 3.
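
The passage from (a) to (b) can be tried out in a concrete field. The following is a small sketch, assuming the field is a prime field $F_p$ represented by integers modulo p; the function name is hypothetical. Given a representation of $-1$ as a sum of three squares, it returns the two-square representation read off from (b).

```python
def two_squares_from_three(x, y, z, p):
    """Given -1 = x^2 + y^2 + z^2 in F_p with 1 + x^2 != 0,
    return (u, v) with -1 = u^2 + v^2, following step (b)."""
    if (x * x + y * y + z * z + 1) % p != 0:
        raise ValueError("need -1 = x^2 + y^2 + z^2 (mod p)")
    d = (1 + x * x) % p
    if d == 0:
        raise ValueError("1 + x^2 = 0: -1 is already a square")
    d_inv = pow(d, -1, p)          # modular inverse (Python 3.8+)
    u = (y + x * z) * d_inv % p
    v = (z - x * y) * d_inv % p
    return u, v


if __name__ == "__main__":
    # In F_7: -1 = 6 = 1^2 + 1^2 + 2^2.
    u, v = two_squares_from_three(1, 1, 2, 7)
    print(u, v, (u * u + v * v) % 7)
```

In $F_7$ this turns $-1 = 1^2 + 1^2 + 2^2$ into $-1 = 5^2 + 4^2$.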


It may similarly be shown that no odd number greater than 1 can be the Stufe of any field, but
can 6 or 10 or 12 be the Stufe of a suitable field? We shall have to experiment with various fields
and then make a conjecture. The rationals Q and the reals R, being formally real, are of no
use; the complexes C provide a trivial example: $-1 = i^2$ giving s(C) = 1. So let us look
at the imaginary quadratic fields. Indeed we have the following
Theorem 1. Let D > 0 be a square free integer; then the Stufe s(K) of $K = \mathbb{Q}(\sqrt{-D})$ is

$$s(K) = \begin{cases} 1 & \text{if } D = 1,\\ 2 & \text{if } D \ne 1 \text{ and } D \ne 8b+7,\\ 4 & \text{if } D = 8b+7.\end{cases}$$

If D < 0, then K is of course formally real.


Proof: [12]. Writing $D = a^2 + b^2 + c^2 + d^2$, a, b, c, d ∈ Z, we see that
$0 = (\sqrt{-D})^2 + a^2 + b^2 + c^2 + d^2$, giving s(K) ≤ 4. Now s(K) = 1 if and only if
$\sqrt{-1} \in K$ and this happens only in the case D = 1. If D ≢ 7 (mod 8), then D is a sum of
three squares and so $0 = (\sqrt{-D})^2 + a^2 + b^2 + c^2$, giving s(K) ≤ 3. But we have already
seen that s cannot be 3, hence s(K) = 2, the case s = 1 being fully cleared.

Finally let D ≡ 7 (mod 8). If s(K) were < 4, then it would be equal to 2, i.e.

$$-1 = (a_1 + b_1\sqrt{-D})^2 + (a_2 + b_2\sqrt{-D})^2, \qquad a_1, b_1, a_2, b_2 \in \mathbb{Q}.$$

Here, without loss of generality, we may suppose that $b_1 \ne 0$. Equating reals and
imaginaries, we get the following two equations:

$$a_1^2 + a_2^2 - D(b_1^2 + b_2^2) = -1,$$
$$a_1b_1 + a_2b_2 = 0.$$

These imply

$$D = \Bigl(\frac{a_2}{b_1}\Bigr)^2 + \Bigl(\frac{b_1}{b_1^2+b_2^2}\Bigr)^2 + \Bigl(\frac{b_2}{b_1^2+b_2^2}\Bigr)^2.$$

Thus D is a sum of 3 rational squares, which is a contradiction since D ≡ 7 (mod 8).
Thus s(K) cannot be less than 4, as required. □
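
The arithmetic input of this proof, namely that a square free D is a sum of three integer squares exactly when D is not of the form 8b + 7, can be checked by machine for small D. The sketch below is only an illustration with hypothetical function names: `stufe_imaginary_quadratic` simply encodes the statement of Theorem 1 and does not compute inside the field $\mathbb{Q}(\sqrt{-D})$.

```python
from math import isqrt


def is_squarefree(n):
    """True if no square d^2 > 1 divides n."""
    return all(n % (d * d) for d in range(2, isqrt(n) + 1))


def is_sum_of_three_squares(n):
    """Brute-force test whether n = a^2 + b^2 + c^2 with integers a, b, c >= 0."""
    r = isqrt(n)
    for a in range(r + 1):
        for b in range(a, r + 1):
            rest = n - a * a - b * b
            if rest < 0:
                break
            if isqrt(rest) ** 2 == rest:
                return True
    return False


def stufe_imaginary_quadratic(D):
    """Stufe of Q(sqrt(-D)) for square free D > 0, as stated in Theorem 1."""
    if D == 1:
        return 1
    return 4 if D % 8 == 7 else 2


if __name__ == "__main__":
    for D in (1, 2, 3, 5, 6, 7, 15, 23):
        print(D, is_sum_of_three_squares(D), stufe_imaginary_quadratic(D))
```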


Let us next look at all the finite fields. We have the following easy 


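Theorem 2. Let $F_q$ be the finite field of $q = p^a$ elements; then

$$s(F_q) = \begin{cases} 1 & \text{if either } p = 2, \text{ or } p \equiv 1\ (4), \text{ or } p \equiv 3\ (4),\ 2 \mid a,\\ 2 & \text{otherwise, i.e. if } p \equiv 3\ (4),\ 2 \nmid a.\end{cases}$$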


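Proof: First let p = 2. Then $-1 = 1 = 1^2$ giving $s(F_{2^a}) = 1$.

Next if p ≡ 1 (4), then $\bigl(\frac{-1}{p}\bigr) = 1$, i.e. $-1 = x^2$ is solvable in
$F_p \subset F_{p^a}$ (for all a). So $s(F_{p^a}) = 1$.

Let now p ≡ 3 (4). Let $A = \{-1 - X^2 \mid X = 1, 2, \ldots, \frac{p-1}{2}, 0\}$ and
$B = \{Y^2 \mid Y = 1, 2, \ldots, \frac{p-1}{2}, 0\}$, $A, B \subset F_p$. Then
$|A| = |B| = \frac{p+1}{2}$. Hence by the pigeon hole principle, there exist $X_0, Y_0 \in F_p$
such that $Y_0^2 = -1 - X_0^2$, i.e. $-1 = X_0^2 + Y_0^2$ in $F_p$. But $-1$ is not
a square in $F_p$ since p ≡ 3 (4). It follows that $s(F_p) = 2$ if p ≡ 3 (4).

Now $F_{p^2} = F_p(\sqrt{-1})$ and here $-1 = (\sqrt{-1})^2$; so $s(F_{p^2}) = 1$. But
$F_{p^a} \supset F_{p^2}$ if 2 | a, so $F_{p^a}$ has Stufe 1 if 2 | a.

If $s(F_{p^a}) = 1$ even for 2 ∤ a, then $-1 = X^2$ is solvable in $F_{p^a}$ (2 ∤ a), so
$F_{p^a} \supset F_p(X) = F_p(\sqrt{-1}) = F_{p^2}$, which is false since 2 ∤ a. Since
$F_{p^a} \supset F_p$ and $s(F_p) = 2$, we have $s(F_{p^a}) \le 2$; hence $s(F_{p^a}) = 2$ if 2 ∤ a. □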
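
For prime fields (the case a = 1 of Theorem 2) the Stufe can also be found by exhaustive search. The following is a minimal sketch, assuming p is prime and representing $F_p$ by integers modulo p; the function name is hypothetical, and fields $F_{p^a}$ with a > 1 are not covered.

```python
def stufe_prime_field(p):
    """Smallest s such that -1 is a sum of s squares in F_p (p prime)."""
    squares = {(x * x) % p for x in range(p)}
    target = (-1) % p
    reachable = {0}  # values that are sums of zero squares
    for s in range(1, p + 1):
        # All values that are sums of s squares (zero summands allowed).
        reachable = {(r + q) % p for r in reachable for q in squares}
        if target in reachable:
            return s
    raise ValueError("no representation found; is p prime?")


if __name__ == "__main__":
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23):
        print(p, stufe_prime_field(p))
```

The output is 1 for p = 2 and for p ≡ 1 (4), and 2 for p ≡ 3 (4), as the theorem asserts.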




That more or less exhausts all the easy fields using elementary methods. If we allow
the Hasse-Minkowski theorem, all those algebraic number fields K for which s(K) is
finite can be dealt with in a single go. The exact result is the following:
Theorem A. Let K = Q(α) be an algebraic number field with [K : Q] = n finite. Then s(K)
is finite iff K is totally complex (i.e. all the zeros of irr(α, Q) are non-real) and then
s(K) ≤ 4, i.e. s(K) equals 1, 2 or 4, because we have seen that the Stufe cannot be 3; in fact

s(K) = 1 iff i ∈ K ($i = \sqrt{-1}$),
s(K) = 4 iff i ∉ K and for all primes $\mathfrak{p} \mid 2$, the local degree $[K_{\mathfrak{p}} : \mathbb{Q}_2]$ is odd,
s(K) = 2 in all remaining cases.

For a proof see [13], p. 261.
We see that experimentation is not easy and so it was all the more surprising, when Pfister 
proved the following beautiful 


Theorem 3. For any field K, the Stufe s(K), if finite, is always a power of 2. Conversely, 
every power of 2 is the Stufe of some field K. 


We shall give a proof of this result in the sequel. In showing that 3 cannot occur as
Stufe, the transition from equation (a) to (b) (see before Theorem 1) is the crucial step in the
process. We have, more generally, the curious looking identity

$$(X_1^2 + X_2^2)(Y_1^2 + Y_2^2) = (X_1Y_1 - X_2Y_2)^2 + (X_1Y_2 + X_2Y_1)^2 \qquad (1)$$

which tells us that a product of two sums of two squares is itself a sum of two squares.
Known to the Greeks, (1) is equivalent to the statement: the norm of the product of two
complex numbers $Z_1$, $Z_2$ is the product of their norms:

$$|Z_1Z_2|^2 = |Z_1|^2\,|Z_2|^2 \qquad (1')$$

for writing $Z_1 = X_1 + iX_2$, $Z_2 = Y_1 + iY_2$, we see that
$Z_1Z_2 = (X_1Y_1 - X_2Y_2) + i(X_1Y_2 + X_2Y_1)$ and so (1) and (1') are the same.
This identity (1) enables us to prove another curious result:
For any field K, the set $G_2(K) = \{a \in K^* \mid a = x^2 + y^2,\ x, y \in K\}$ is a multiplicative
group.
For, the closure property is the identity (1) while if $a = x^2 + y^2 \in G_2(K)$, then

$$\frac{1}{a} = \frac{a}{a^2} = \frac{x^2+y^2}{a^2} = \Bigl(\frac{x}{a}\Bigr)^2 + \Bigl(\frac{y}{a}\Bigr)^2 \in G_2(K).$$


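The following striking identity was already known to Euler in 1770 and he used it to
prove Lagrange's theorem that every positive integer is a sum of four squares.

$$(X_1^2 + X_2^2 + X_3^2 + X_4^2)(Y_1^2 + Y_2^2 + Y_3^2 + Y_4^2) = Z_1^2 + Z_2^2 + Z_3^2 + Z_4^2 \qquad (2)$$

where

$$\begin{aligned}
Z_1 &= X_1Y_1 - X_2Y_2 - X_3Y_3 - X_4Y_4,\\
Z_2 &= X_1Y_2 + X_2Y_1 + X_3Y_4 - X_4Y_3,\\
Z_3 &= X_1Y_3 + X_3Y_1 - X_2Y_4 + X_4Y_2,\\
Z_4 &= X_1Y_4 + X_4Y_1 + X_2Y_3 - X_3Y_2.
\end{aligned}$$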
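
Identity (2) is easy to test numerically. The snippet below is a small sketch with hypothetical function names; it evaluates the four bilinear expressions $Z_1, \ldots, Z_4$ on integer inputs and compares both sides.

```python
import random


def euler_four_square(x, y):
    """Return (Z_1, Z_2, Z_3, Z_4) of identity (2) for 4-tuples x and y."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 - x2 * y2 - x3 * y3 - x4 * y4,
        x1 * y2 + x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 + x3 * y1 - x2 * y4 + x4 * y2,
        x1 * y4 + x4 * y1 + x2 * y3 - x3 * y2,
    )


def sum_of_squares(v):
    return sum(c * c for c in v)


if __name__ == "__main__":
    # 3 = 1 + 1 + 1 + 0 and 7 = 4 + 1 + 1 + 1 give a representation of 21.
    z = euler_four_square((1, 1, 1, 0), (2, 1, 1, 1))
    print(z, sum_of_squares(z))
    x = [random.randint(-50, 50) for _ in range(4)]
    y = [random.randint(-50, 50) for _ in range(4)]
    print(sum_of_squares(euler_four_square(x, y)) == sum_of_squares(x) * sum_of_squares(y))
```

Here the representations $3 = 1^2+1^2+1^2+0^2$ and $7 = 2^2+1^2+1^2+1^2$ combine to $21 = 0^2+4^2+2^2+1^2$.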
The discovery of quaternions by William Hamilton, in 1843, brought out the real significance
of the identity (2) in as much as (2) is simply the fact that the norm of a product of two
quaternions is equal to the product of their norms.

Almost immediately after Hamilton's discovery of the quaternions, Arthur Cayley, in
1845, discovered the octonions (the Cayley numbers) which give rise to the incredible
looking identity

$$(X_1^2 + \cdots + X_8^2)(Y_1^2 + \cdots + Y_8^2) = Z_1^2 + \cdots + Z_8^2 \qquad (3)$$

where

$$\begin{aligned}
Z_1 &= X_1Y_1 - X_2Y_2 - X_3Y_3 - X_4Y_4 - X_5Y_5 - X_6Y_6 - X_7Y_7 - X_8Y_8,\\
Z_2 &= X_1Y_2 + X_2Y_1 + X_3Y_4 - X_4Y_3 + X_5Y_6 - X_6Y_5 - X_7Y_8 + X_8Y_7,\\
Z_3 &= X_1Y_3 + X_3Y_1 - X_2Y_4 + X_4Y_2 + X_5Y_7 - X_7Y_5 + X_6Y_8 - X_8Y_6,\\
Z_4 &= X_1Y_4 + X_4Y_1 + X_2Y_3 - X_3Y_2 + X_5Y_8 - X_8Y_5 - X_6Y_7 + X_7Y_6,\\
Z_5 &= X_1Y_5 + X_5Y_1 - X_2Y_6 + X_6Y_2 - X_3Y_7 + X_7Y_3 - X_4Y_8 + X_8Y_4,\\
Z_6 &= X_1Y_6 + X_6Y_1 + X_2Y_5 - X_5Y_2 - X_3Y_8 + X_8Y_3 + X_4Y_7 - X_7Y_4,\\
Z_7 &= X_1Y_7 + X_7Y_1 + X_2Y_8 - X_8Y_2 + X_3Y_5 - X_5Y_3 - X_4Y_6 + X_6Y_4,\\
Z_8 &= X_1Y_8 + X_8Y_1 - X_2Y_7 + X_7Y_2 + X_3Y_6 - X_6Y_3 + X_4Y_5 - X_5Y_4.
\end{aligned}$$

Although the identity emerges most naturally from Cayley numbers, it was discovered
nearly a quarter of a century earlier by C.F. Degen (1822) with minor sign differences.
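
With sixty-four signed terms, identity (3) is a natural candidate for a machine check. The following sketch (hypothetical function names, integer inputs) evaluates the eight bilinear forms $Z_1, \ldots, Z_8$ in the order and with the signs used for (3) here and compares both sides on random data; it is a numerical sanity check, not a proof.

```python
import random


def degen_eight_square(x, y):
    """Return (Z_1, ..., Z_8) of identity (3) for 8-tuples x and y."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    y1, y2, y3, y4, y5, y6, y7, y8 = y
    return (
        x1*y1 - x2*y2 - x3*y3 - x4*y4 - x5*y5 - x6*y6 - x7*y7 - x8*y8,
        x1*y2 + x2*y1 + x3*y4 - x4*y3 + x5*y6 - x6*y5 - x7*y8 + x8*y7,
        x1*y3 + x3*y1 - x2*y4 + x4*y2 + x5*y7 - x7*y5 + x6*y8 - x8*y6,
        x1*y4 + x4*y1 + x2*y3 - x3*y2 + x5*y8 - x8*y5 - x6*y7 + x7*y6,
        x1*y5 + x5*y1 - x2*y6 + x6*y2 - x3*y7 + x7*y3 - x4*y8 + x8*y4,
        x1*y6 + x6*y1 + x2*y5 - x5*y2 - x3*y8 + x8*y3 + x4*y7 - x7*y4,
        x1*y7 + x7*y1 + x2*y8 - x8*y2 + x3*y5 - x5*y3 - x4*y6 + x6*y4,
        x1*y8 + x8*y1 - x2*y7 + x7*y2 + x3*y6 - x6*y3 + x4*y5 - x5*y4,
    )


def sum_of_squares(v):
    return sum(c * c for c in v)


if __name__ == "__main__":
    x = [random.randint(-20, 20) for _ in range(8)]
    y = [random.randint(-20, 20) for _ in range(8)]
    z = degen_eight_square(x, y)
    print(sum_of_squares(z) == sum_of_squares(x) * sum_of_squares(y))
```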

Degen stated (erroneously of course) that there is a like formula for $2^n$ squares. For the
case of 16 squares, he gave the literal parts of the 16 bilinear functions $Z_1, Z_2, \ldots, Z_{16}$ but
left most of the signs undetermined, saying that the only difficulty is the prolixity of the
ambiguities of signs.

Degen was also aware of the 2- and 4-variable Pfister forms $X_1^2 + aX_2^2$ and
$X_1^2 + aX_2^2 + b(X_3^2 + aX_4^2)$, both of which satisfy identities similar to (1) and (2).
As before, if we define

$$G_4 = \{a \in K^* \mid a = x_1^2 + \cdots + x_4^2,\ x_i \in K\}$$

and $G_8$ similarly, then it follows from (2) and (3) respectively that $G_4$ and $G_8$ are groups
under multiplication, so that we have the chain of inclusions

$$K^{*2} \subseteq G_2 \subseteq G_4 \subseteq G_8.$$


A great many unsuccessful attempts followed Degen's discovery of (3), to extend formulae
(1), (2) and (3) to a similar 16 term identity, and many workers, realizing the impossibility of
such an extension, tried giving convincing arguments to prove the impossibility. Hamilton's
and Cayley's discoveries had reduced the problem to the determination of the so-called
normed algebras over the real numbers R; the four known ones being R (of dimension 1),
the complex numbers C (of dimension 2), the quaternions H (of dimension 4) and the
octonions O (of dimension 8). It is an astonishing observation how the axioms of the
ordered field R gradually drop off as we move up these higher dimensional hypercomplex
systems: C is, no doubt, a field, commutative and associative (under multiplication) and
a division ring, but the order property is lost. H is only an associative division ring; thus
commutativity and order are both lost. Finally O is not even associative; it is merely a
division ring; thus commutativity, associativity and order are all lost.

The half century following the discovery of these quaternions and octonions then saw
many attempts to find a 16-dimensional hypercomplex system over the reals and several
erroneous affirmations were given. Finally in 1898, Hurwitz [6] gave a decisive solution to
the problem about the dimensionality of all possible normed algebras over R and so also
about the possible values of n for which there is an identity of the type (3) with n terms.
More precisely we have the following.


Theorem 4 (Hurwitz-1898). Let K be a field with char K # 2. The only values of n for 
which there is an identity of the type 


Ce eres ay Gh 0 are Sree oe don Are (4) 
where the Z, are bilinear functions of the X; and the Y;, coefficients in K aren = 1, 2,4, 8. 


Actually Hurwitz proved this only over C but his proof generalizes to any field K with
char K ≠ 2. We give here a proof given by Dickson in his beautiful expository paper [5]
of 1919. A proof using normed algebras can be found in A.A. Albert’s Studies in Modern 
Algebra [2]. 


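Proof (Dickson). The idea is to convert (4) into a system of matrix equations. The
bilinearity condition on the $Z_k$ can be written as

$$\begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_n \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix} = AY,$$

where the $a_{ij}$ are linear functions of $X_1, X_2, \ldots, X_n$. Then (4) becomes

$$(X_1^2 + \cdots + X_n^2)\,(Y_1, \ldots, Y_n) \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix} = (Z_1, \ldots, Z_n) \begin{pmatrix} Z_1 \\ \vdots \\ Z_n \end{pmatrix} = Y'A'AY,$$

i.e.

$$(Y_1, \ldots, Y_n)\,\{(X_1^2 + \cdots + X_n^2) I_n - A'A\} \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix} = 0,$$

and since this is true for all $Y_1, Y_2, \ldots, Y_n$, it follows that

$$A'A = (X_1^2 + \cdots + X_n^2) I_n. \tag{a}$$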
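Now

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots \\ a_{21} & a_{22} & \cdots \\ \vdots & & \end{pmatrix} = A_1X_1 + A_2X_2 + \cdots + A_nX_n, \text{ say},$$

where the $(i,j)$ entry of the constant matrix $A_k$ is the coefficient of $X_k$ in the linear form $a_{ij}$.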
By (a),

$$(A_1'X_1 + A_2'X_2 + \cdots + A_n'X_n)(A_1X_1 + A_2X_2 + \cdots + A_nX_n) = (X_1^2 + X_2^2 + \cdots + X_n^2) I_n.$$

Since this is true for all $X_j$, we have

(1) $A_j'A_j = I_n$ $(j = 1, 2, \ldots, n)$, hence also $A_jA_j' = I_n$,
(2) $A_j'A_k + A_k'A_j = 0$, $1 \le j, k \le n$, $j \ne k$.


Conversely, the existence of such a system implies that (4) holds with Z, bilinear in the X; 
and the Y;. Note also that if n = 1, (2) is vacuous, and (1) can be trivially satisfied so we 
may suppose n > 1. 

Now let $B_i = A_n'A_i$ $(i = 1, 2, \ldots, n-1)$. The B’s are easily seen to satisfy

(1) $B_i'B_i = I_n$,
(2) $B_i' + B_i = 0$ $(i = 1, 2, \ldots, n-1)$,
(3) $B_i'B_j + B_j'B_i = 0$ $(i \ne j)$.

Hence we have the relations (b):

(i) $B_i' = -B_i$ $(i = 1, 2, \ldots, n-1)$, i.e. the $B_i$ are skew-symmetric matrices,
(ii) $B_i^2 = -I$ $(i = 1, 2, \ldots, n-1)$,                                   (b)
(iii) $B_iB_j = -B_jB_i$ $(i \ne j;\ i, j = 1, 2, \ldots, n-1)$.


It follows that $|B_i| = |B_i'| = |-B_i| = (-1)^n |B_i|$, and since $|B_i| \ne 0$ we must have n
even. Hence

Proposition 1. There is no identity of the type (4) if n (> 1) is odd.


In future, therefore, we suppose n to be even. Now consider the following set $\mathcal{G}$ of n × n
matrices:

$$\mathcal{G} = \{\, I,\ B_{i_1},\ B_{i_1}B_{i_2},\ B_{i_1}B_{i_2}B_{i_3},\ \ldots,\ B_{i_1}B_{i_2}\cdots B_{i_{n-2}} \text{ and } B_1B_2\cdots B_{n-1} \quad (i_1 \le n-1,\ i_1 < i_2 \le n-1,\ \ldots) \,\}.$$

Here $B_{i_1}$ takes n − 1 values viz. $B_1, B_2, \ldots, B_{n-1}$, while $B_{i_1}B_{i_2}$ takes $\binom{n-1}{2}$ values viz.
$B_1B_2, B_1B_3, \ldots$ etc. So altogether there are $1 + \binom{n-1}{1} + \binom{n-1}{2} + \cdots + \binom{n-1}{n-1} = 2^{n-1}$ elements in
the set $\mathcal{G}$. Let $G = B_{i_1}B_{i_2}\cdots B_{i_r} \in \mathcal{G}$. Then we have


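Lemma 1. G is symmetric if r ≡ 0 or 3 modulo 4, and skew-symmetric if r ≡ 1 or 2
modulo 4.

Proof:

$$G' = (B_{i_1}B_{i_2}\cdots B_{i_r})' = B_{i_r}'\cdots B_{i_1}' = (-1)^r B_{i_r}\cdots B_{i_1}$$
$$= (-1)^r(-1)^{r-1} B_{i_1}(B_{i_r}\cdots B_{i_2})$$

by (iii) of (b) to commute $B_{i_1}$ successively with $B_{i_r}, \ldots, B_{i_2}$,

$$= (-1)^r(-1)^{r-1}(-1)^{r-2} B_{i_1}B_{i_2}(B_{i_r}\cdots B_{i_3})$$

and so on

$$= (-1)^r(-1)^{r-1}\cdots(-1)^2(-1)\, B_{i_1}B_{i_2}\cdots B_{i_r} = (-1)^{r(r+1)/2}\, G$$
$$= \begin{cases} G & \text{if } r \equiv 0, 3 \ (4), \\ -G & \text{if } r \equiv 1, 2 \ (4). \end{cases}$$

□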


Lemma 2. Let $M \in \mathcal{G}$. Then the set $M\mathcal{G} = \{MG \mid G \in \mathcal{G}\}$ is simply a permutation of $\mathcal{G}$
with each term prefixed with either +1 or −1.


Proof: The result is clear if the multiplier M is $B_1$, since then the product will contain or
lack $B_1$ according as the multiplicand G lacks or contains $B_1$ (use again (b)).

If the multiplier is $B_2$, we first replace $B_1B_2\cdots$, wherever it appears, by $-B_2B_1\cdots$,
and see that the former argument applies.

After thus proving our statement when the multiplier is any $B_i$, we see that it holds when
the multiplier is any product of the B’s. □


An Example: n = 4.

$$\mathcal{G} = \{I,\ B_1,\ B_2,\ B_3,\ B_1B_2,\ B_2B_3,\ B_1B_3,\ B_1B_2B_3\}.$$

Then

$$B_3\mathcal{G} = \{B_3,\ B_3B_1,\ B_3B_2,\ B_3^2,\ B_3B_1B_2,\ B_3B_2B_3,\ B_3B_1B_3,\ B_3B_1B_2B_3\}$$
$$= \{B_3,\ -B_1B_3,\ -B_2B_3,\ -I,\ B_1B_2B_3,\ B_2,\ B_1,\ -B_1B_2\}$$
$$= \{-I,\ B_1,\ B_2,\ B_3,\ -B_1B_2,\ -B_2B_3,\ -B_1B_3,\ B_1B_2B_3\}.$$
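The relations (b) can be checked on a concrete triple of 4 × 4 matrices. The sketch below is only an illustration: the matrices are a hypothetical choice (left multiplication by the quaternion units i, j, k on the basis 1, i, j, k), not matrices taken from Dickson's proof, and the helper names are made up for this example.

```python
def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def neg(a):
    return [[-x for x in row] for row in a]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Assumed example matrices: left multiplication by i, j, k on the quaternions.
B1 = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]]
B2 = [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]]
B3 = [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
Bs = [B1, B2, B3]

# (i) skew-symmetry, (ii) B^2 = -I, (iii) pairwise anticommutation.
skew = all(transpose(B) == neg(B) for B in Bs)
square_minus_one = all(matmul(B, B) == neg(I4) for B in Bs)
anticommute = all(
    matmul(Bs[i], Bs[j]) == neg(matmul(Bs[j], Bs[i]))
    for i in range(3) for j in range(3) if i != j
)

# The product of all three B's, a product of r = 3 factors (symmetric by Lemma 1).
triple = matmul(matmul(B1, B2), B3)
```

For this particular choice the triple product is a scalar multiple of the identity, so a linear relation between I and the product of all three matrices does occur when n = 4.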


Our aim is now the following. 


Proposition 2. At least half of the elements of G are linearly independent. 


With this in view, we look for any linear relations that can exist amongst the elements 
of G. 
Definition: A relation $\lambda_1G_1 + \lambda_2G_2 + \cdots + \lambda_tG_t = 0$, $G_i \in \mathcal{G}$, $\lambda_i \in \mathbb{R}$, or R = 0 for
short, is called irreducible if it is not possible to express R as $R_1 + R_2$, where $R_1 = 0$, $R_2 = 0$
represent two linear relations that hold between the subsets $R_1$ and $R_2$ of R with $R_1 \cap R_2 = \emptyset$,
i.e. there are no matrices common to $R_1$ and $R_2$.


We have the following. 


Lemma 3. An irreducible relation R = 0 cannot involve both symmetric and skew-symmetric
matrices.


Proof: Let $M_1$ be the sum of all the terms of R involving symmetric matrices and $M_2$ the sum
of all the terms involving skew-symmetric matrices. Then $M_1 + M_2 = 0$, i.e. $M_1 = -M_2$. Hence
$M_1 = M_1' = -M_2' = M_2$, i.e. $M_1 = M_2$. It follows that $M_1 = 0$, $M_2 = 0$, which contradicts the
irreducibility of R = 0. □


Now let R = 0 be any irreducible relation between the matrices of $\mathcal{G}$. By multiplying
R by a suitable $\lambda G$ ($\lambda \in \mathbb{R}$, $G \in \mathcal{G}$) we get a new relation T = 0, one term of which is I
and all the remaining terms are products of matrices of $\mathcal{G}$ by real constants. For suppose
$\mu G$ ($\mu \in \mathbb{R}$, $G \in \mathcal{G}$) is a term in R which we wish to become I in the relation T = 0.
One just multiplies R = 0 by $\pm\mu^{-1}G^{-1}$ and notes that one of $\pm G^{-1}$ lies in $\mathcal{G}$.

For example if $4B_2B_3$ is one term of R, then on multiplying R = 0 by

$$\tfrac14 (B_2B_3)^{-1} = \tfrac14 B_3^{-1}B_2^{-1} = \tfrac14 (-B_3)(-B_2) = \tfrac14 B_3B_2 = -\tfrac14 B_2B_3,$$

we get what is required.

This new relation T = 0 is also irreducible, for if T = 0 were to split as $T_1 = 0$, $T_2 = 0$,
then since $T = \lambda GR$ we have $\lambda^{-1}G^{-1}T = R$ and so R = 0 splits as $\lambda^{-1}G^{-1}T_1 = 0$,
$\lambda^{-1}G^{-1}T_2 = 0$, which gives a contradiction.

Hence we may suppose that T = 0 looks like

$$I = \sum c_{i_1i_2i_3} B_{i_1}B_{i_2}B_{i_3} + \sum d_{i_1i_2i_3i_4} B_{i_1}B_{i_2}B_{i_3}B_{i_4} + \cdots, \tag{*}$$

where by Lemma 3, each of the matrices $B_{i_1}B_{i_2}B_{i_3}$, $B_{i_1}B_{i_2}B_{i_3}B_{i_4}$, etc. is symmetric since
I is symmetric. That is why no singleton $B_i$ nor any of the products $B_iB_j$ of two B’s can
be involved in (*), since $B_i$ and $B_iB_j$ are skew-symmetric by Lemma 1.

Now multiply (*) throughout on the right by $B_i$ to obtain an irreducible relation which
then involves only skew-symmetric matrices, since one term (on the left side) is the skew-symmetric
matrix $B_i$. But by Lemma 1, $B_{i_1}B_{i_2}B_{i_3}B_i$ is symmetric. So all the $c_{i_1i_2i_3}$ are 0 if
only i is distinct from $i_1, i_2, i_3$. Since i may have any value ≤ n − 1, we see that each c is 0
unless n − 1 = 3, for then i cannot be chosen different from $i_1, i_2, i_3$.

Next we show that all the d’s are 0 too; for multiply (*) by $B_{i_1}$ and it becomes

$$B_{i_1} = \sum d_{i_1i_2i_3i_4} B_{i_1}B_{i_2}B_{i_3}B_{i_4}B_{i_1} + \cdots.$$

But $B_{i_1}B_{i_2}B_{i_3}B_{i_4}B_{i_1} = (-1)^3 B_{i_2}B_{i_3}B_{i_4}B_{i_1}^2 = B_{i_2}B_{i_3}B_{i_4}$. So (*) becomes

$$B_{i_1} = \sum d_{i_1i_2i_3i_4} B_{i_2}B_{i_3}B_{i_4} + \cdots.$$

Here $B_{i_2}B_{i_3}B_{i_4}$ is symmetric, while $B_{i_1}$ is skew-symmetric (by Lemma 1). It follows that
all the d’s are 0 too.

The method used in proving c = 0 applies when the number r of factors in $B_{i_1}B_{i_2}\cdots B_{i_r}$
is ≡ 3 (4) and r < n − 1. Similarly the method used in proving d = 0 also applies when
r ≡ 0 (4).

Hence if our relation exists, it has the form

$$I = kB_1B_2\cdots B_{n-1},$$

the right hand term being the only survivor. Now I is symmetric so $B_1B_2\cdots B_{n-1}$ is
symmetric, i.e. n − 1 ≡ 0 or 3 (4), but n is even so n − 1 ≡ 3 (4), i.e. n ≡ 0 (4).
We have thus proved the following.

If an irreducible relation between the elements
of $\mathcal{G}$ does exist, then n ≡ 0 (4).

Now square this relation to get

$$I = k^2 B_1B_2\cdots B_{n-1}\,B_1B_2\cdots B_{n-1} = k^2(-1)^{n-1} B_2\cdots B_{n-1}\,B_2\cdots B_{n-1} = \cdots = k^2(-1)^{n(n-1)/2} I.$$

Since n ≡ 0 (4), we see that $k^2 = 1$, i.e. k = ±1. Hence we have the following.


Lemma 4. If n ≡ 2 (4) then the $2^{n-1}$ matrices of $\mathcal{G}$ are linearly independent, while for
n ≡ 0 (4), they are either linearly independent or are connected by the relations which arise
from the relation $I = \pm B_1B_2\cdots B_{n-1}$ through multiplication by the various elements of
$\mathcal{G}$, but are connected by no further irreducible linear relations.


Example: let n = 4. Then

$$\mathcal{G} = \{I,\ B_1,\ B_2,\ B_3,\ B_1B_2,\ B_2B_3,\ B_1B_3,\ B_1B_2B_3\}$$

and these eight matrices are either linearly independent or are connected by the following
four irreducible linear relations and no others:

$$I = \pm B_1B_2B_3, \quad B_1 = \mp B_2B_3, \quad B_2 = \pm B_1B_3, \quad B_3 = \mp B_1B_2.$$

These express $B_1B_2B_3$, $B_2B_3$, $B_1B_3$, $B_1B_2$ linearly in terms of I, $B_1$, $B_2$, $B_3$; so that these
latter matrices are, in any case, linearly independent.

Now consider all the irreducible linear relations that exist between the elements of $\mathcal{G}$. As
we have seen, they are all of the type

$$G \cdot I = \pm G \cdot B_1B_2\cdots B_{n-1} \quad (G \in \mathcal{G})$$

and no others. Now reduce the right side of this using (b). Then one of G or the reduced
right side obviously contains fewer than half of the B’s while the other contains more than
half of the B’s.
Thus these irreducible linear relations merely serve to express the products containing
more than half of the B’s in terms of those with less than half of the B’s.

So in every case (i.e. irrespective of whether n ≡ 0 or 2 (mod 4)) the $2^{n-2}$ matrices of
$\mathcal{G}$ which are products of fewer than $\frac{n-1}{2}$ B’s are linearly independent. Hence for all values
of n (necessarily even), if there is to be an identity of the type (4), the $2^{n-2}$ matrices of $\mathcal{G}$
consisting of the products of at most $\frac{n-2}{2}$ B’s are linearly independent.

This completes the proof of Proposition 2. □


We can now give a proof of our main result.
The elements of $\mathcal{G}$ are all n × n matrices and the maximum number of linearly independent
n × n matrices is $n^2$ since they form, over the reals, a vector space of dimension $n^2$. Hence
by the proposition we get

$$2^{n-2} \le n^2.$$

This is satisfied if n ≤ 8 but fails if n = 10. Now if it fails for n = m, then it fails for
n = m + 1, for we have

$$2^{m+1-2} = 2 \cdot 2^{m-2} > 2m^2 \ \text{(since the relation fails for } m\text{)} \ > (m+1)^2 \ \text{if } m \ge 3.$$
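The counting bound is easy to tabulate. The following minimal sketch (function and variable names are illustrative only) lists the even n up to an arbitrary cutoff of 40 for which $2^{n-2} \le n^2$ still holds.

```python
def bound_holds(n):
    """True when the dimension count 2^(n-2) <= n^2 is satisfied."""
    return 2 ** (n - 2) <= n * n

# Even candidates only, since odd n > 1 is excluded by Proposition 1.
surviving_even_n = [n for n in range(2, 41, 2) if bound_holds(n)]
```

Only n = 2, 4, 6, 8 survive, which is why the argument still has to treat n = 6 separately.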


It follows that if an identity of the type (4) exists, then n ≤ 8 (and n is even). For n = 2, 4, 8
we already have the required type of identities. It remains to dispose of the case n = 6.
Suppose an identity exists for n = 6. Then since 6 ≡ 2 (4), we see that

(i) the $2^5$ matrices of $\mathcal{G}$ are linearly independent.

Of these 32 matrices, 16 are skew-symmetric by Lemma 1, viz. the ones that are products
of 1, 2 or 5 B’s. But

(ii) between any 16 skew-symmetric 6 × 6 matrices
there exists a linear relation.

This is because the 15 matrices
with a 1 in one place above the main diagonal, −1 in the corresponding place below,
and 0’s elsewhere, form a basis for the subspace of all 6 × 6 skew-symmetric matrices and
so this subspace has dimension 15. This proves (ii).
(i) and (ii) above are contradictory. Hence no identity of type (4) can exist for n = 6.
That at last completes the proof of Hurwitz’s theorem. □
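The two numbers compared in the case n = 6 can be recomputed directly. This is a small sketch with made-up function names: by Lemma 1 a product of r distinct B's is skew-symmetric exactly when r ≡ 1 or 2 (mod 4), and the space of skew-symmetric n × n matrices has dimension n(n − 1)/2.

```python
from math import comb

def skew_products(n):
    """Number of products of r distinct B's (r <= n-1) that are skew-symmetric."""
    return sum(comb(n - 1, r) for r in range(n) if r % 4 in (1, 2))

def skew_dimension(n):
    """Dimension of the space of skew-symmetric n x n matrices."""
    return n * (n - 1) // 2

count_6 = skew_products(6)    # products of 1, 2 or 5 of the five B's
dim_6 = skew_dimension(6)
```

Since the count exceeds the dimension, the 16 skew-symmetric products cannot be linearly independent, contradicting (i).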


Remark: The proof works for any field K of characteristic ≠ 2.


Although the impossibility of the identity (4) for n ≠ 1, 2, 4, 8 has been proved, it was
under the stringent restriction that the $Z_k$ are bilinear polynomials in the $X_i$ and the $Y_j$. One
could look into the possibility of the existence of other values of n for which (4) holds, if
we relax this bilinear condition and allow the $Z_k$ to be more general polynomials in the $X_i$
and the $Y_j$. However, in 1966, Frank Adams [1] showed that when n is not 1, 2, 4, 8, there
are no identities of the type (4) even if the $Z_k$ are allowed to be any bi-skew, continuous
functions of the $X_i$ and the $Y_j$ (where a mapping $f : K^n \times K^n \to K^n$ is called bi-skew if

$$f(-x, y) = f(x, -y) = -f(x, y) \quad \text{for all } x \in K^n,\ y \in K^n).$$


It was thus totally unexpected when in 1965, Albrecht Pfister [10] proved the following
remarkable theorem.


Theorem 5. Let K be a field and let $n = 2^m$ be a power of 2. Then there are identities

$$(X_1^2 + \cdots + X_n^2)(Y_1^2 + \cdots + Y_n^2) = Z_1^2 + \cdots + Z_n^2 \tag{5}$$

where the $Z_k$ are linear functions of the $Y_j$ with coefficients in $K(X_1, \ldots, X_n)$: $Z_k = \sum_{j=1}^{n} T_{kj}Y_j$
with $T_{kj} \in K(X_1, \ldots, X_n)$.

Conversely suppose n is not a power of 2. Then there is a field K such that there is
no identity (5) with the $Z_k \in K(X_1, \ldots, X_n, Y_1, \ldots, Y_n)$. Here the $Z_k$ are not even
demanded to be linear in the $Y_j$.


We shall now give proofs of theorems 3 and 5. In the process we shall get other results 
which are interesting in their own right. 

The proof of the first part of theorem 5 requires no elaborate algebraic machinery and is, 
indeed, remarkably simple. We dispose of it first. 


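Proof of the first part of theorem 5: We use induction on m. We know that (5) holds for
m = 1, 2, 3 (see (1), (2), (3)). Suppose it holds for m. Write $T = (T_{ij})$ so that

$$\begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_n \end{pmatrix} = T \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}. \tag{i}$$

Then (5) can be written as

$$(X_1^2 + \cdots + X_n^2)(Y_1^2 + \cdots + Y_n^2) = (X_1^2 + \cdots + X_n^2)\,(Y_1, \ldots, Y_n) \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}$$
$$= Z_1^2 + \cdots + Z_n^2 = (Z_1, \ldots, Z_n) \begin{pmatrix} Z_1 \\ \vdots \\ Z_n \end{pmatrix} = (Y_1, \ldots, Y_n)\, T'T \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}, \text{ by (i)},$$

i.e.

$$(Y_1, \ldots, Y_n)\,\{(X_1^2 + \cdots + X_n^2) I_n - T'T\} \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix} = 0.$$

Since this is true for all $Y_1, \ldots, Y_n$, we must have

$$T'T = (X_1^2 + \cdots + X_n^2) I_n,$$

and so also $TT' = (X_1^2 + \cdots + X_n^2) I_n$, T being orthogonal up to the scalar factor $X_1^2 + \cdots + X_n^2$.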
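We now prove (5) for $2^{m+1} = 2n$. Write

$$(X_1, \ldots, X_{2n}) = (X^{(1)}, X^{(2)})$$

where $X^{(1)} = (X_1, \ldots, X_n)$ and $X^{(2)} = (X_{n+1}, \ldots, X_{2n})$. By the induction hypothesis
there exist two matrices $T^{(1)}$, $T^{(2)}$ say, corresponding to $X^{(1)}$, $X^{(2)}$ respectively,
such that

$$(X_1^2 + \cdots + X_n^2) I_n = X^{(1)}X^{(1)'} I_n = T^{(1)}T^{(1)'} = T^{(1)'}T^{(1)}$$

and                                                                          (ii)

$$(X_{n+1}^2 + \cdots + X_{2n}^2) I_n = X^{(2)}X^{(2)'} I_n = T^{(2)}T^{(2)'} = T^{(2)'}T^{(2)},$$

and we wish to show that there exists a matrix T say, such that

$$T'T = (X_1^2 + \cdots + X_n^2 + X_{n+1}^2 + \cdots + X_{2n}^2) I_{2n}. \tag{iii}$$

Try $T = \begin{pmatrix} T^{(1)} & T^{(2)} \\ T^{(2)} & X \end{pmatrix}$, a partitioned matrix, where the n × n block X will be determined by (iii).
We have

$$TT' = \begin{pmatrix} T^{(1)} & T^{(2)} \\ T^{(2)} & X \end{pmatrix} \begin{pmatrix} T^{(1)'} & T^{(2)'} \\ T^{(2)'} & X' \end{pmatrix},$$

and using block multiplication of matrices this equals

$$\begin{pmatrix} T^{(1)}T^{(1)'} + T^{(2)}T^{(2)'} & T^{(1)}T^{(2)'} + T^{(2)}X' \\ T^{(2)}T^{(1)'} + XT^{(2)'} & T^{(2)}T^{(2)'} + XX' \end{pmatrix} = \begin{pmatrix} (X_1^2 + \cdots + X_{2n}^2) I_n & A \\ B & C \end{pmatrix}$$

say; we want to choose X so that A = B = 0 and

$$C = (X_1^2 + \cdots + X_n^2 + X_{n+1}^2 + \cdots + X_{2n}^2) I_n.$$

To make A = 0 we have to have $X = -T^{(2)}T^{(1)'}(T^{(2)'})^{-1}$. This automatically makes
B = 0 (just check). Now it seems too much to expect C to be what we want. But we have

$$C = T^{(2)}T^{(2)'} + T^{(2)}T^{(1)'}(T^{(2)'})^{-1}(T^{(2)})^{-1}T^{(1)}T^{(2)'}$$
$$= (X_{n+1}^2 + \cdots + X_{2n}^2) I_n + (X_{n+1}^2 + \cdots + X_{2n}^2)^{-1}\, T^{(2)}T^{(1)'}T^{(1)}T^{(2)'}$$
$$= (X_{n+1}^2 + \cdots + X_{2n}^2) I_n + (X_{n+1}^2 + \cdots + X_{2n}^2)^{-1}(X_1^2 + \cdots + X_n^2)\, T^{(2)}T^{(2)'}$$
$$= (X_{n+1}^2 + \cdots + X_{2n}^2) I_n + (X_1^2 + \cdots + X_n^2) I_n$$
$$= (X_1^2 + \cdots + X_n^2 + X_{n+1}^2 + \cdots + X_{2n}^2) I_n.$$

Thus $TT'$ is the required scalar matrix, and hence so is $T'T$, which is (iii).

This completes the proof of the first part of Theorem 5. □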
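The doubling step can be tried numerically. The sketch below works over the rationals with exact arithmetic; the function names are invented for this illustration, and it assumes that the sum of squares of the second half of the variables is non-zero at every level of the recursion, so that the inverse of $T^{(2)'}$ can be written as $T^{(2)}$ divided by that sum.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def doubling_matrix(xs):
    """Matrix T with T T' = (sum of squares of xs) * I; len(xs) must be a power of 2."""
    xs = [Fraction(x) for x in xs]
    n = len(xs)
    if n == 1:
        return [[xs[0]]]
    half = n // 2
    t1 = doubling_matrix(xs[:half])
    t2 = doubling_matrix(xs[half:])
    b = sum(x * x for x in xs[half:])          # T2 T2' = b I, assumed non-zero
    # Lower-right block X = -T2 T1' (T2')^{-1} = -(1/b) T2 T1' T2.
    prod = matmul(matmul(t2, transpose(t1)), t2)
    x_block = [[-entry / b for entry in row] for row in prod]
    top = [r1 + r2 for r1, r2 in zip(t1, t2)]
    bottom = [r2 + rx for r2, rx in zip(t2, x_block)]
    return top + bottom

xs = [1, 2, 3, 4, 5, 6, 7, 8]                  # arbitrary sample values
ys = [2, -1, 0, 3, 1, 1, -2, 4]
T = doubling_matrix(xs)
Z = [sum(t * y for t, y in zip(row, ys)) for row in T]   # Z = T Y
lhs = sum(x * x for x in xs) * sum(y * y for y in ys)
rhs = sum(z * z for z in Z)
```

The entries of T are rational functions of the chosen values rather than polynomials, which is consistent with the statement that the coefficients lie in $K(X_1, \ldots, X_n)$.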


We now come to a result of Cassels [4], which, in a way, was the starting point of this
whole business and which is an indispensable tool in our further developments.


Cassels’ Lemma (1964). Let $f(X) \in K[X]$ be a polynomial with coefficients in K. If
f(X) is a sum of n squares of elements of the field K(X), then it is a sum of n squares of
elements of the ring K[X].


Note: What is new in this enunciation is that the same number n of squares suffices in K[X];
without this condition, the result had been proved by Artin [3].


Proof: There are three trivial cases of the lemma which we dispose of first.


(i) n = 1. Then $f(X) = (p(X)/q(X))^2$, so $q(X) \mid p(X)$.
(ii) char K = 2. Then $a^2 + b^2 = (a + b)^2$ and so if

$$f(X) = \gamma_1^2(X) + \cdots + \gamma_n^2(X)$$

then combining two squares at a time into one, f(X) reduces to a single square, i.e.
we land up in case (i).
(iii) −1 is a sum of n − 1 squares of elements in K.

Say $-1 = b_1^2 + \cdots + b_{n-1}^2$. Then for any f(X), we have

$$f = \Big(\frac{f+1}{2}\Big)^2 - \Big(\frac{f-1}{2}\Big)^2 = \Big(\frac{f+1}{2}\Big)^2 + \Big(\frac{b_1(f-1)}{2}\Big)^2 + \cdots + \Big(\frac{b_{n-1}(f-1)}{2}\Big)^2,$$

a sum of n squares of elements of K[X].
So now let us suppose none of these three cases holds and let

$$f(X) = (p_1(X)/q_1(X))^2 + \cdots + (p_n(X)/q_n(X))^2.$$

Dropping the X from now on and clearing the denominators, this gives

$$fZ^2 = Y_1^2 + \cdots + Y_n^2, \quad Z, Y_1, \ldots, Y_n \in K[X],\ Z \ne 0.$$

Thus the equation

$$fZ^2 = Y_1^2 + \cdots + Y_n^2 \tag{a}$$

has a solution $(Z, Y_1, \ldots, Y_n)$ with Z ≠ 0 and we have to show that there exists a solution
of (a) with $Z \in K$ (Z ≠ 0), i.e. with degree of Z (in X) = 0. Now since (a) has a solution
with Z ≠ 0, there is a solution, call it $(\zeta, \eta_1, \ldots, \eta_n)$, with ζ ≠ 0 for which deg ζ is as
small as possible:

$$f\zeta^2 = \eta_1^2 + \cdots + \eta_n^2. \tag{b}$$

We shall show that this degree is 0, i.e. that $\zeta \in K$, by showing that if not, then there exists
a solution, say $(\zeta^*, \eta_1^*, \ldots, \eta_n^*)$, with $\zeta^* \ne 0$ and $\deg \zeta^* < \deg \zeta$.
So suppose deg ζ > 0. By the division algorithm in K[X], we can write, for j = 1, ..., n,

$$\eta_j = \lambda_j\zeta + \gamma_j,$$

where either $\gamma_j = 0$ or $\deg \gamma_j < \deg \zeta$, i.e.

$$\eta_j/\zeta = \lambda_j + \gamma_j/\zeta. \tag{c}$$

Note that not all the $\gamma_j$ can be zero, otherwise ζ divides all the $\eta_j$ so that (b) becomes
$f = \lambda_1^2 + \cdots + \lambda_n^2$, a contradiction, since the degree of ζ was least possible.


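Now let

$$\zeta^* = \zeta\Big\{\sum_j \lambda_j^2 - f\Big\} - 2\Big\{\sum_j \lambda_j\eta_j - f\zeta\Big\}$$

and

$$\eta_j^* = \eta_j\Big\{\sum_i \lambda_i^2 - f\Big\} - 2\lambda_j\Big\{\sum_i \lambda_i\eta_i - f\zeta\Big\}.$$

Then visibly, all of $\zeta^*, \eta_1^*, \ldots, \eta_n^* \in K[X]$. We now claim

(i) that $(\zeta^*, \eta_1^*, \ldots, \eta_n^*)$ is a solution of (a),
(ii) $\zeta^* \ne 0$ and
(iii) $\deg \zeta^* < \deg \zeta$.

This would then contradict the definition of ζ and so would prove the lemma.
We prove (i) by brute force: we must show that $\sum_j \eta_j^{*2} - f\zeta^{*2} = 0$, i.e. that

$$\sum_j \Big[\eta_j^2\Big\{\sum_i \lambda_i^2 - f\Big\}^2 + 4\lambda_j^2\Big\{\sum_i \lambda_i\eta_i - f\zeta\Big\}^2 - 4\lambda_j\eta_j\Big\{\sum_i \lambda_i^2 - f\Big\}\Big\{\sum_i \lambda_i\eta_i - f\zeta\Big\}\Big]$$
$$= f\Big[\zeta^2\Big\{\sum_i \lambda_i^2 - f\Big\}^2 + 4\Big\{\sum_i \lambda_i\eta_i - f\zeta\Big\}^2 - 4\zeta\Big\{\sum_i \lambda_i^2 - f\Big\}\Big\{\sum_i \lambda_i\eta_i - f\zeta\Big\}\Big].$$

Here the first terms from both sides cancel since $\sum_j \eta_j^2 = f\zeta^2$ and it remains to prove that

$$4\Big\{\sum_i \lambda_i\eta_i - f\zeta\Big\}\Big[\Big(\sum_i \lambda_i\eta_i - f\zeta\Big)\sum_j \lambda_j^2 - \Big(\sum_i \lambda_i^2 - f\Big)\sum_j \lambda_j\eta_j - \Big(\sum_i \lambda_i\eta_i - f\zeta\Big) f + \Big(\sum_i \lambda_i^2 - f\Big) f\zeta\Big] = 0.$$

Here the expression in square brackets just cancels out.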
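To prove (ii) and (iii) we substitute for $\lambda_j$ from (c) in $\zeta^*$. Then

$$\zeta^* = \zeta\sum_j\Big(\frac{\eta_j}{\zeta} - \frac{\gamma_j}{\zeta}\Big)^2 - 2\sum_j\Big(\frac{\eta_j}{\zeta} - \frac{\gamma_j}{\zeta}\Big)\eta_j + f\zeta$$
$$= \frac{1}{\zeta}\,(\gamma_1^2 + \cdots + \gamma_n^2) \quad (\text{using } f\zeta^2 = \eta_1^2 + \cdots + \eta_n^2).$$

Here not all the $\gamma_j$ are zero (as already noted) and so $\sum_j \gamma_j^2$ is non-zero since otherwise by
equating the coefficient of the highest power in X to 0, we find that 0 is a sum of at most n
squares of elements of K, which is the third trivial case of the lemma. Thus $\zeta^* \ne 0$, which
proves (ii).

Finally $\zeta^* = \frac{1}{\zeta}\sum_j \gamma_j^2$ giving $\zeta\zeta^* = \sum_j \gamma_j^2$. Equating degrees, we get $\deg \zeta + \deg \zeta^* = 2\max_j(\deg \gamma_j) < 2\deg \zeta$ since $\deg \gamma_j < \deg \zeta$ (for all j). Thus $\deg \zeta^* < \deg \zeta$,
which proves (iii).

This completes the proof of Cassels’ lemma. □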
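Claim (i) is a purely algebraic identity, so it can be sanity-checked with numbers in place of polynomials. The sketch below is only that: it treats f, ζ, the $\eta_j$ and the $\lambda_j$ as rational numbers (an assumption made for illustration), chooses f so that $f\zeta^2 = \sum \eta_j^2$, and verifies that the new tuple again satisfies (a). It says nothing about the degree descent in (ii) and (iii).

```python
from fractions import Fraction

def cassels_step(f, zeta, eta, lam):
    """Return (zeta*, eta*) built from the formulas for the new solution."""
    a = sum(l * l for l in lam) - f                          # sum lambda_j^2 - f
    b = sum(l * e for l, e in zip(lam, eta)) - f * zeta      # sum lambda_j eta_j - f zeta
    zeta_star = zeta * a - 2 * b
    eta_star = [e * a - 2 * l * b for e, l in zip(eta, lam)]
    return zeta_star, eta_star

# Sample data (hypothetical): f is chosen so that f * zeta^2 = eta_1^2 + eta_2^2.
zeta = Fraction(5)
eta = [Fraction(3), Fraction(4)]
f = sum(e * e for e in eta) / (zeta * zeta)
lam = [Fraction(2), Fraction(7)]                             # arbitrary lambdas

zeta_star, eta_star = cassels_step(f, zeta, eta, lam)
still_a_solution = f * zeta_star ** 2 == sum(e * e for e in eta_star)
```

Because the identity holds for arbitrary $\lambda_j$, the particular choice coming from the division algorithm matters only for controlling the degree of $\zeta^*$.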


Remark: The solution $(\zeta^*, \eta_1^*, \ldots, \eta_n^*)$ does not just come out of the blue. It is the second
point of intersection Q of the quadric (a) with the line joining the points $P = (\zeta, \eta_1, \ldots, \eta_n)$
(on the quadric) and $P' = (1, \lambda_1, \ldots, \lambda_n)$ (in space) in the n-dimensional projective space
over the field K(X). The simplest way to get this point Q is as follows: a general point of
the line PP′ is

$$(\theta\zeta + \varphi,\ \theta\eta_1 + \varphi\lambda_1,\ \ldots,\ \theta\eta_n + \varphi\lambda_n),$$

θ/φ being a parameter for various points, φ = 0 giving the point P. To get Q we substitute
this general point in the quadric (a):

$$f(\theta^2\zeta^2 + \varphi^2 + 2\theta\varphi\zeta) = \sum_{j=1}^{n}(\theta^2\eta_j^2 + \varphi^2\lambda_j^2 + 2\theta\varphi\eta_j\lambda_j),$$

i.e. $\theta^2\big(\zeta^2 f - \sum_j \eta_j^2\big) + 2\theta\varphi\big(f\zeta - \sum_j \lambda_j\eta_j\big) + \varphi^2\big(f - \sum_j \lambda_j^2\big) = 0$. But $\zeta^2 f = \sum_j \eta_j^2$,
so this becomes $2\theta\varphi\big(f\zeta - \sum_j \lambda_j\eta_j\big) + \varphi^2\big(f - \sum_j \lambda_j^2\big) = 0$. This has a root φ = 0 as
expected giving the point P. The other root is $\theta/\varphi = -\big(\sum_j \lambda_j^2 - f\big)\big/\,2\big(\sum_j \lambda_j\eta_j - \zeta f\big)$,
and substituting this in the general point and multiplying by a suitable factor (allowed in a
projective space) we get our point Q as required.

We now deduce a few corollaries from this lemma.


Corollary 1. Let char K ≠ 2 and let $f(X_1, \ldots, X_m) \in K(X_1, \ldots, X_m)$ be a sum of n squares
of elements of $K(X_1, \ldots, X_m)$. Let $a_1, a_2, \ldots, a_m \in K$ be such that $f(a_1, \ldots, a_m)$ is
defined (i.e. the denominator is not 0). Then $f(a_1, \ldots, a_m)$ is a sum of n squares in K.

Remark: The point is that although $f(X_1, \ldots, X_m)$ is defined at $(a_1, \ldots, a_m)$, it may well
happen that the summands $\gamma_j(X_1, \ldots, X_m)$ on the right hand side of $f(X_1, \ldots, X_m) = \gamma_1^2 + \cdots + \gamma_n^2$ are not defined at $(a_1, \ldots, a_m)$; still, according to the corollary,
$f(a_1, \ldots, a_m)$ is a sum of n squares in K.


Proof: We use induction on m. For m = 1, we have

$$f(X) = g(X)/h(X) = \gamma_1^2(X) + \cdots + \gamma_n^2(X).$$

Then $gh = (\gamma_1h)^2 + \cdots + (\gamma_nh)^2$. Thus gh, which is in K[X], is a sum of n squares in
K(X) and so by Cassels’ lemma, it is a sum of n squares in K[X]:

$$gh = f_1^2 + \cdots + f_n^2 \quad (f_j \in K[X]).$$

Hence $g(X)/h(X) = (f_1/h)^2 + \cdots + (f_n/h)^2$. Now by hypothesis, f(a) = g(a)/h(a)
is defined, i.e. h(a) ≠ 0, so each $f_j(a)/h(a)$ is defined and f(a) is a sum of n squares in K.

Let now m > 1. Let $L = K(X_1, \ldots, X_{m-1})$. Assume the result for m − 1 variables and
let $g(X_1, \ldots, X_m)/h(X_1, \ldots, X_m)$ be a rational function which is a sum of n squares in
$K(X_1, \ldots, X_m)$. Regard g/h as a rational function of $X_m$ belonging to $L(X_m)$. So by the
case m = 1, we see that $g(X_1, \ldots, X_{m-1}, a_m)/h(X_1, \ldots, X_{m-1}, a_m)$ is a sum of n squares
in $L = K(X_1, \ldots, X_{m-1})$. So by the induction hypothesis $g(a_1, \ldots, a_m)/h(a_1, \ldots, a_m)$
is a sum of n squares in K. This completes the proof of the corollary. □


Corollary 2. Suppose $n = 2^m$. Let $G_n$ be the set of all non-zero elements of K which are
sums of n squares in K. Then $G_n$ is a group under multiplication.


Proof: Let $\alpha, \beta \in G_n$, say $\alpha = \alpha_1^2 + \cdots + \alpha_n^2$, $\beta = \beta_1^2 + \cdots + \beta_n^2$. Then $\alpha^{-1} = \alpha/\alpha^2 = (\alpha_1/\alpha)^2 + \cdots + (\alpha_n/\alpha)^2 \in G_n$ and it remains to prove that $\alpha\beta \in G_n$. Consider the identity

$$(X_1^2 + \cdots + X_n^2)(Y_1^2 + \cdots + Y_n^2) = Z_1^2 + \cdots + Z_n^2$$

which exists since $n = 2^m$. In this let $X_1 \to \alpha_1, \ldots, X_n \to \alpha_n$, $Y_1 \to \beta_1, \ldots, Y_n \to \beta_n$.
Then the left side is well defined and equal to αβ and so by Corollary 1, the right side is a
sum of n squares of elements of K, i.e. $\alpha\beta \in G_n$ as required. □


We see that it is the identity (5) that does the trick. 
We can now prove the first part of Theorem 3: s(K) is always a power of 2. 
Proof of the first part of Theorem 3: Write s = s(K) and let

$$n = 2^m \le s(K) < 2^{m+1}. \tag{*}$$

Then $a_1^2 + \cdots + a_n^2 + a_{n+1}^2 + \cdots + a_s^2 + 1 = 0$ ($a_j \in K$). Let $A = a_1^2 + \cdots + a_n^2$, $B = a_{n+1}^2 + \cdots + a_s^2 + 1$. Here A, B are both non-zero, otherwise s(K) < s. Also A, B both belong to
$G_n$ (by adding a suitable number of 0’s to B if necessary). Then A + B = 0 so A = −B,
i.e. $-1 = B/A \in G_n$ since $G_n$ is a group, i.e. $-1 = c_1^2 + \cdots + c_n^2$ giving s(K) ≤ n.
Comparing with (*), we get $s(K) = n = 2^m$.
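As a small sanity check of the statement that the Stufe is a power of 2, one can compute it by brute force for prime fields. Finite fields are not discussed in the text above; they are used here only as an assumed, easily computable family, and the function name and search cutoff are illustrative.

```python
def stufe_mod_p(p, max_terms=4):
    """Least s such that -1 is a sum of s squares in the prime field F_p (brute force)."""
    squares = {(x * x) % p for x in range(p)}
    target = (-1) % p
    reachable = set(squares)                 # sums of exactly one square
    for s in range(1, max_terms + 1):
        if target in reachable:
            return s
        # Extend to sums of s + 1 squares (0 is a square, so the sets only grow).
        reachable = {(u + v) % p for u in reachable for v in squares}
    return None

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
stufen = {p: stufe_mod_p(p) for p in primes}
```

In this family every value found is 1 or 2, in agreement with the theorem.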


To prove the remaining parts of Theorems 3 and 5 we need to deduce some more 
corollaries from Cassels’ lemma; see [4]. 


Corollary 3. Let char K ≠ 2. A necessary and sufficient condition for $X^2 + d \in K[X]$ to
be a sum of n squares in K(X) (and so in K[X] by Cassels’ lemma) is that either

(i) −1 is a sum of n − 1 squares in K or
(ii) d is a sum of n − 1 squares in K.


Proof: If $-1 = b_1^2 + \cdots + b_{n-1}^2$, then for any polynomial $f(X) \in K[X]$, we have

$$f = \Big(\frac{f+1}{2}\Big)^2 - \Big(\frac{f-1}{2}\Big)^2 = \Big(\frac{f+1}{2}\Big)^2 + \Big(\frac{b_1(f-1)}{2}\Big)^2 + \cdots + \Big(\frac{b_{n-1}(f-1)}{2}\Big)^2.$$

In particular $X^2 + d$ is a sum of n squares.

If d is a sum of n − 1 squares then visibly $X^2 + d$ is a sum of n squares in K[X].

For the converse, suppose $X^2 + d$ is a sum of n squares in K[X]. If (i) holds, well and
good; otherwise let $X^2 + d = p_1^2(X) + \cdots + p_n^2(X)$ say. Here we may suppose the $p_j(X)$
to be linear polynomials in X for if not, then equating to 0 the coefficient of the highest
power of X gives (i). Then

$$X^2 + d = (a_1X + b_1)^2 + \cdots + (a_nX + b_n)^2. \tag{*}$$

Now one of the equations $C = \pm(a_nC + b_n)$ is always solvable in K. For if $a_n \ne 1$ then
$C = +(a_nC + b_n)$ is solvable, while if $a_n = 1$ then $C = -(a_nC + b_n)$ is solvable since
char K ≠ 2. Now put X = C in (*):

$$C^2 + d = (a_1C + b_1)^2 + \cdots + (a_{n-1}C + b_{n-1})^2 + (a_nC + b_n)^2.$$

Cancelling $C^2$ with $(a_nC + b_n)^2$ we see that d is a sum of n − 1 squares in K. This completes
the proof. □


Corollary 4. Let R be the field of real numbers. Then $X_1^2 + \cdots + X_n^2$ is not a sum of n − 1
squares of elements in $R(X_1, \ldots, X_n)$.

Proof: We use induction on n. For n = 1, the result is trivial. So suppose the result is
true for n − 1. Let $K = R(X_1, \ldots, X_{n-1})$, $X_n = X$ and $d = X_1^2 + \cdots + X_{n-1}^2$. If
$X_1^2 + \cdots + X_n^2 = X^2 + d$ is a sum of n − 1 squares in $K(X) = R(X_1, \ldots, X_n)$, then by Corollary 3,
$d = X_1^2 + \cdots + X_{n-1}^2$ is a sum of n − 2 squares in K, since −1 is clearly not a sum of
n − 2 squares in K; indeed it is not a sum of squares at all in K, which is formally real. This
contradicts the induction hypothesis and completes the proof of Corollary 4. □


We are now in a position to complete the proofs of the remaining parts of Theorems 3 
and 5. 
Every power of 2 is the Stufe of some field K. 


Proof: Let $n = 2^m$ and let $K = R(X_1, \ldots, X_{n+1}, Y)$ where $X_1, \ldots, X_{n+1}$ are independent
transcendentals over R and Y satisfies the equation

$$Y^2 + X_1^2 + \cdots + X_{n+1}^2 = 0. \tag{i}$$

We claim that $s(K) = n = 2^m$. In any case by (i), s(K) ≤ n + 1 and so is at most n since
n + 1 cannot be a power of 2 whereas s(K) is (except in the trivial case n = 1, i.e. m = 0).

If s(K) < n then there exist $t_1, \ldots, t_n \in K$, not all zero, such that

$$t_1^2 + \cdots + t_n^2 = 0. \tag{ii}$$

Let $L = R(X_1, \ldots, X_{n+1})$ so that K = L(Y). By (i), Y is algebraic over L of degree
2 and so each element of K is a linear polynomial in Y with coefficients from L. Write
$t_j = a_j + Yb_j$, $a_j, b_j \in L$. Then by (ii) we see that

$$\sum a_j^2 + Y^2\sum b_j^2 = 0, \qquad \sum a_jb_j = 0.$$

Here not all the $a_j$ are zero, otherwise $\sum b_j^2 = 0$ and so each $b_j = 0$ since the $b_j \in L = R(X_1, \ldots, X_{n+1})$, which is formally real. Then each $t_j$ would be zero which is not true. Similarly
not all the $b_j$ are zero. Hence $\sum a_j^2 \ne 0$, $\sum b_j^2 \ne 0$ and

$$-Y^2 = \sum_{j=1}^{n} a_j^2 \Big/ \sum_{j=1}^{n} b_j^2 \in G_n \quad (\text{by the group property of } G_n)$$
$$= c_1^2 + \cdots + c_n^2, \text{ say, } c_j \in L,$$

i.e. $X_1^2 + \cdots + X_{n+1}^2$ is a sum of n squares in L, which contradicts Corollary 4. Thus s(K)
is not less than n and so equals n. This completes the proof. □


Remark: The proof also works for $Q(X_1, \ldots, X_{n+1}, Y)$.


Finally we prove the remaining part of Theorem 5. 
Suppose n is not a power of 2. Then there is a field K such that there is no identity 


$$(X_1^2 + \cdots + X_n^2)(Y_1^2 + \cdots + Y_n^2) = Z_1^2 + \cdots + Z_n^2$$
with $Z_j \in K(X_1, \ldots, X_n, Y_1, \ldots, Y_n)$.




Proof: Let $2^{m-1} < n < 2^m$. Let K be a field having Stufe $2^m = \nu$, say. Then
$a_1^2 + \cdots + a_n^2 + a_{n+1}^2 + \cdots + a_\nu^2 + 1 = 0$. Let $A = a_1^2 + \cdots + a_n^2$, $B = a_{n+1}^2 + \cdots + a_\nu^2 + 1$;
hence $A, B \in G_n$ and if an identity of the above type exists, then $G_n$ is a group (see the
proof of Corollary 2). So $-1 = B/A \in G_n$, i.e. $-1 = c_1^2 + \cdots + c_n^2$ ($c_j \in K$), hence
s(K) ≤ n < ν. But K was chosen to have Stufe ν. This gives a contradiction and so
completes the proof. □


Remark 1. In our examples of fields with high Stufe both the fields $R(X_1, \ldots, X_{n+1}, Y)$
and $Q(X_1, \ldots, X_{n+1}, Y)$ are of high transcendence degree over R or Q as the case may be.
We have the following 


Problem: Does high Stufe always imply high degree of transcendence? (over R or Q). 


Let us now go back to the identity (4):

$$(X_1^2 + \cdots + X_n^2)(Y_1^2 + \cdots + Y_n^2) = Z_1^2 + \cdots + Z_n^2$$

where the $Z_k$ are bilinear functions of the $X_i$ and the $Y_j$ with coefficients in the field K.
There are three obvious ways of generalizing this identity (one of which we have already
looked at in theorem 5). They are

a) Allow the $Z_k$ to be rational functions of the $X_i$ and the $Y_j$: $Z_k \in K(X_1, \ldots, X_n, Y_1, \ldots, Y_n)$. Then, as we have seen in theorem 5, such identities can be found for
each power $n = 2^m$ (m = 0, 1, ...) of 2 and for no other value of n.

b) Consider the (r, s, n) identity

$$(X_1^2 + \cdots + X_r^2)(Y_1^2 + \cdots + Y_s^2) = Z_1^2 + \cdots + Z_n^2 \tag{6}$$

with $Z_k$ bilinear in the $X_i$ and the $Y_j$, and determine, for given r, s, the least value of n for
which (6) holds. We could, alternatively, look for the maximum value of r, for given s and
n, for which (6) holds.

For general values of r,s, n, little is known about (6). However, for s = n, Hurwitz and 
Radon gave a solution of (6) in 1922-3, for the field R of real numbers. Before giving the 
exact statement of the Hurwitz-Radon theorem we make the following 


Definition 2. We say the triple (r, s, n) is admissible over K if (6) holds. 


Thus (r, s, rs) is trivially admissible over any field K; so that what we want is the most
economical n for which (r, s, n) is admissible for a given pair r, s of integers. In view of
this we have the


Definition 3. We denote by r ∗ s (or rather $r *_K s$) the least n for which (r, s, n) is admissible over K.
We have the trivial bounds

$$\max(r, s) \le r * s \le rs.$$

It is not easy to determine r ∗ s, even for small values of r, s.
Alternatively, as already said above, we could ask, for given s, n, the maximum value of
r for which (r, s, n) is admissible over K. This is the approach adopted by Hurwitz and Radon
in their treatment of (6) for the field R of real numbers. Simultaneously, Hurwitz solved
(6), in this special case, for the field C of complex numbers, published posthumously in
1923. Various authors have since dealt with other fields.

As illustrations of definitions 2 and 3, we have the following: 


Examples: 


(i) (n, n, n) is admissible over R, indeed over any field K with char K ≠ 2, iff n =
1, 2, 4, 8. This is Hurwitz’s theorem (Theorem 4).
(ii) (1, n, n), and indeed (r, s, rs), is admissible for all n, r, s over any field K.
(iii) If char K = 2, then $r *_K s = 1$ for all r, s, for then $a^2 + b^2 = (a + b)^2$.
(iv) 8 ∗ 8 = 8 for max(8, 8) ≤ 8 ∗ 8 ≤ 8. Similarly 4 ∗ 4 = 4 and 2 ∗ 2 = 2.
(v) The 16-square problem: Before Hurwitz, studies about the (r, s, n)-identities (6)
were exclusively restricted to the polynomial ring $Z[X_1, \ldots, X_r, Y_1, \ldots, Y_s]$ over
Z. One then speaks of the $(r, s, n)_Z$-identities. It has recently been confirmed that
$16 *_Z 16 = 32$, thereby completing the solution of the so-called 16-square problem
in the integer coefficient case (see [16]). However, the integer $\nu = 16 *_R 16$ is not
known to date. Various methods developed by K.Y. Lam and J. Adem narrow down
the range of ν to 23 ≤ ν ≤ 32. The values 23, 24 were subsequently ruled out by
Lam and Yuzvinsky. By going more deeply into the geometry of sums of squares
formulae and using sophisticated algebraic topology, it has now been established
by Lam and Yiu that 29 ≤ ν ≤ 32.

It is trivial to see that ν ≤ 32; indeed

$$\Big(\sum_{i=1}^{16} X_i^2\Big)\Big(\sum_{j=1}^{16} Y_j^2\Big) = \Big(\sum_{i=1}^{8} X_i^2 + \sum_{i=9}^{16} X_i^2\Big)\Big(\sum_{j=1}^{8} Y_j^2 + \sum_{j=9}^{16} Y_j^2\Big),$$

which is a sum of 32 squares, using the 8-square identity four times.

(vi) Amongst small values of r, s, n, even (10, 11, 25) is not known to be admissible or
otherwise.


Definition 4. For any positive integer n, define the so-called Radon function ρ(n) as
follows:
Write $n = 2^m \cdot u$ (u odd); then

$$\rho(n) = \begin{cases} 2m + 1 \\ 2m \\ 2m \\ 2m + 2 \end{cases} \text{ according as } m \equiv \begin{cases} 0 \\ 1 \\ 2 \\ 3 \end{cases} \pmod 4.$$

Equivalently write m = 4a + b, 0 ≤ b ≤ 3; then

$$\rho(n) = 8a + 2^b.$$
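The two descriptions of ρ(n) are easy to compare mechanically. A minimal sketch follows; the function names are illustrative and not taken from the text.

```python
def radon(n):
    """Radon function via n = 2^m * u (u odd), m = 4a + b, rho = 8a + 2^b."""
    m = 0
    while n % 2 == 0:
        n //= 2
        m += 1
    a, b = divmod(m, 4)
    return 8 * a + 2 ** b

def radon_by_cases(n):
    """Same function via the four cases m = 0, 1, 2, 3 (mod 4)."""
    m = 0
    while n % 2 == 0:
        n //= 2
        m += 1
    return [2 * m + 1, 2 * m, 2 * m, 2 * m + 2][m % 4]

table = {n: radon(n) for n in (1, 2, 4, 8, 16, 32, 64)}
```

For n = 1, 2, 4, 8 the value ρ(n) equals n, and from n = 16 onwards ρ(n) is smaller than n.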


We now have the following 
Theorem B (Radon, Hurwitz, 1922, 1923). The triple (r, n, n) is admissible over the
field of real numbers (indeed over any field K with char K ≠ 2) iff r ≤ ρ(n).


For a proof see [13], p 127 or [14]. 
(c) Instead of a “product formula” (4) for the form $X_1^2 + \cdots + X_n^2$, look for such a formula
for more general quadratic forms $q(X_1, \ldots, X_n)$; i.e. determine other quadratic forms
$q(X_1, \ldots, X_n)$, if any, for which

$$q(X_1, \ldots, X_n) \cdot q(Y_1, \ldots, Y_n) = q(Z_1, \ldots, Z_n) \tag{7}$$

where $Z_j \in K(X_1, \ldots, X_n, Y_1, \ldots, Y_n)$.


Examples: (i) If $q(X_1, X_2) = X_1^2 + aX_2^2$ ($a \in K$), then we have the curious identity (cf.
identity (1))

$$(X_1^2 + aX_2^2)(Y_1^2 + aY_2^2) = (X_1Y_1 + aX_2Y_2)^2 + a(X_1Y_2 - X_2Y_1)^2.$$

(ii) In 4 variables, we have the striking identity (cf. identity (2))

$$(X_1^2 + aX_2^2 + bX_3^2 + abX_4^2)(Y_1^2 + aY_2^2 + bY_3^2 + abY_4^2)$$
$$= (X_1Y_1 + aX_2Y_2 + bX_3Y_3 + abX_4Y_4)^2$$
$$+ a(-X_1Y_2 + X_2Y_1 - bX_3Y_4 + bX_4Y_3)^2$$
$$+ b(-X_1Y_3 + X_3Y_1 + aX_2Y_4 - aX_4Y_2)^2$$
$$+ ab(-X_1Y_4 + X_4Y_1 - X_2Y_3 + X_3Y_2)^2.$$
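A composition identity for the four-variable form $X_1^2 + aX_2^2 + bX_3^2 + abX_4^2$ can be spot-checked with integers. The sketch below uses one choice of $Z_1, \ldots, Z_4$ (the last one has no factor b inside the bracket) and verifies the identity on random integer inputs; the function names and the sampling range are assumptions made for this illustration.

```python
import random

def q(a, b, v):
    """Value of the form x1^2 + a x2^2 + b x3^2 + ab x4^2."""
    x1, x2, x3, x4 = v
    return x1 * x1 + a * x2 * x2 + b * x3 * x3 + a * b * x4 * x4

def compose(a, b, x, y):
    """A choice of (Z1, Z2, Z3, Z4) with q(x) q(y) = q(Z)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    z1 = x1 * y1 + a * x2 * y2 + b * x3 * y3 + a * b * x4 * y4
    z2 = -x1 * y2 + x2 * y1 - b * x3 * y4 + b * x4 * y3
    z3 = -x1 * y3 + x3 * y1 + a * x2 * y4 - a * x4 * y2
    z4 = -x1 * y4 + x4 * y1 - x2 * y3 + x3 * y2
    return (z1, z2, z3, z4)

rng = random.Random(0)            # fixed seed so the check is reproducible
checks = []
for _ in range(200):
    a, b = rng.randint(-9, 9), rng.randint(-9, 9)
    x = [rng.randint(-9, 9) for _ in range(4)]
    y = [rng.randint(-9, 9) for _ in range(4)]
    checks.append(q(a, b, x) * q(a, b, y) == q(a, b, compose(a, b, x, y)))
identity_holds = all(checks)
```

A random test of this kind does not prove the identity, but a polynomial identity of this low degree that survives many random integer inputs is very unlikely to be wrong.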
Pfister has given a complete solution of (c). He shows that for every power $n = 2^m$ of 2,
there is a form in n variables, generalizing the forms $X_1^2 + aX_2^2$ and $X_1^2 + aX_2^2 + bX_3^2 + abX_4^2$
by an obvious induction, which satisfies a product identity. These are the so-called Pfister
forms. Further there are no other forms that satisfy a product formula (7). For a proof of
these results see [10], [11], [13].


In the identity (5), we have proved that the $Z_k$ may be chosen linear functions of the $Y_j$
with coefficients in $K(X_1, \ldots, X_n)$:

$$Z_k = \sum_{j=1}^{n} T_{kj}Y_j \quad \text{with } T_{kj} \in K(X_1, \ldots, X_n).$$

For n = 2, 4, and 8, these $T_{kj}$ are linear in the $X_i$ as well. It is natural to enquire how
simple we can take the $T_{kj}$ as functions of the $X_i$. By theorem 4, they cannot all be taken
linear in the $X_i$, nor indeed polynomials, by Frank Adams’s theorem, so that some of them
at least have to have a denominator: but can we make at least some of them linear forms in
the $X_i$?
Let us first see what we can do with the first term $Z_1$ and prove the following


Theorem 6. Let $n = 2^m$ and let $X_1, \ldots, X_n, Y_1, \ldots, Y_n \in K$. Then

$$(X_1^2 + \cdots + X_n^2)(Y_1^2 + \cdots + Y_n^2) = (X_1Y_1 + \cdots + X_nY_n)^2 + Z_2^2 + \cdots + Z_n^2$$

for some $Z_2, \ldots, Z_n \in K$.
We first prove: Let $n = 2^m$ and let $c = c_1^2 + \cdots + c_n^2$ ($c_j \in K$). Then there exists an
n × n matrix S with first row $(c_1, \ldots, c_n)$ such that $SS' = S'S = cI_n$.


Proof: First let $c = 0$. If all the $c_j = 0$, we take $S$ to be the zero matrix.

So suppose, say, $c_1 \ne 0$. Let $R$ be the row vector $(c_1, \ldots, c_n)$ and take $S = c_1^{-1} R'R$, which has first row $R$ as required. Further

$$SS' = c_1^{-1}R'R \cdot c_1^{-1}R'R = c_1^{-2} R'(RR')R = 0,$$

since $RR' = c_1^2 + \cdots + c_n^2 = c = 0$. Similarly $S'S = 0$ and the proof is complete. So we may now suppose that $c \ne 0$ and we proceed by induction on $m$.
Write

$$R = (c_1, \ldots, c_{2^m}) = (c_1, \ldots, c_{2^{m-1}}, c_{2^{m-1}+1}, \ldots, c_{2^m}) = (R_1, R_2).$$

Let $a = c_1^2 + \cdots + c_{2^{m-1}}^2$ and $b = c_{2^{m-1}+1}^2 + \cdots + c_{2^m}^2$, so that $c = a + b$. Since $c \ne 0$, $a$ and $b$ cannot both be zero; say, without loss of generality, that $a \ne 0$. By the induction hypothesis, there exist square matrices $S_1, S_2$ of size $2^{m-1}$ such that

$$S_1S_1' = S_1'S_1 = aI_{2^{m-1}}, \qquad S_2S_2' = S_2'S_2 = bI_{2^{m-1}}.$$

Furthermore the first row of $S_1$ is $(c_1, \ldots, c_{2^{m-1}})$ and that of $S_2$ is $(c_{2^{m-1}+1}, \ldots, c_{2^m})$. Now


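let

$$S = \begin{pmatrix} S_1 & S_2 \\ -a^{-1}S_1S_2'S_1 & S_1 \end{pmatrix}.$$

This has first row equal to $R$ as required and an easy matrix computation gives $SS' = S'S = cI_{2^m}$, e.g. (writing $I$ for $I_{2^{m-1}}$)

$$\begin{aligned}
SS' &= \begin{pmatrix} S_1 & S_2 \\ -a^{-1}S_1S_2'S_1 & S_1 \end{pmatrix}
\begin{pmatrix} S_1' & -a^{-1}S_1'S_2S_1' \\ S_2' & S_1' \end{pmatrix} \\
&= \begin{pmatrix} aI + bI & -a^{-1}aS_2S_1' + S_2S_1' \\ -a^{-1}S_1S_2'\,aI + S_1S_2' & a^{-2}S_1S_2'S_1S_1'S_2S_1' + S_1S_1' \end{pmatrix} \\
&= \begin{pmatrix} cI & 0 \\ 0 & bI + aI \end{pmatrix} = cI_{2^m}.
\end{aligned}$$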

Proof of Theorem 6: Write

$$X = X_1^2 + \cdots + X_n^2, \qquad Y = Y_1^2 + \cdots + Y_n^2.$$

Then there exist $n \times n$ matrices $U, V$ such that $UU' = U'U = XI_n$, $VV' = V'V = YI_n$ and

U has 1st row $= (X_1, \ldots, X_n)$,
V has 1st row $= (Y_1, \ldots, Y_n)$.

Then

$$XYI_n = XVV' = V(U'U)V' = (VU')(VU')' = WW', \quad \text{where } W = VU'.$$

This equation says that if $(Z_1, \ldots, Z_n)$ is the first row of $W$, then $XY = Z_1^2 + \cdots + Z_n^2$. But since $W = VU'$, we have $Z_1 = X_1Y_1 + \cdots + X_nY_n$. □
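The inductive construction of $S$ and the proof of Theorem 6 can be traced in exact rational arithmetic. The sketch below is illustrative only: the names `build_S` and `first_row_of_W` are hypothetical, and it assumes $K = \mathbb{Q}$, where $c = 0$ forces all $c_j = 0$, so the special case $c = 0$, $c_1 \ne 0$ of the proof never arises. The "without loss of generality $a \ne 0$" step is realised here by swapping the two halves and swapping the column blocks back, which is one possible reading, not necessarily the author's.

```python
from fractions import Fraction


def transpose(A):
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(x * y for x, y in zip(row, col)) for col in Bt] for row in A]


def scale(k, A):
    return [[k * x for x in row] for row in A]


def build_S(c):
    """Matrix S with first row c and S S' = S' S = (sum of c_j^2) I.

    Assumes len(c) is a power of 2 and the entries are rational.
    """
    c = [Fraction(x) for x in c]
    n = len(c)
    if n == 1:
        return [[c[0]]]
    h = n // 2
    R1, R2 = c[:h], c[h:]
    a = sum(x * x for x in R1)
    b = sum(x * x for x in R2)
    if a == 0 and b == 0:
        return [[Fraction(0)] * n for _ in range(n)]
    if a == 0:
        # Build the matrix for (R2, R1) and swap the column blocks back.
        T = build_S(R2 + R1)
        return [row[h:] + row[:h] for row in T]
    S1, S2 = build_S(R1), build_S(R2)
    # Lower-left block: -a^{-1} S1 S2' S1.
    lower_left = scale(-1 / a, matmul(matmul(S1, transpose(S2)), S1))
    top = [r1 + r2 for r1, r2 in zip(S1, S2)]
    bottom = [l + r1 for l, r1 in zip(lower_left, S1)]
    return top + bottom


def first_row_of_W(X, Y):
    """First row (Z1, ..., Zn) of W = V U', as in the proof of Theorem 6."""
    U, V = build_S(X), build_S(Y)
    return matmul(V, transpose(U))[0]


if __name__ == "__main__":
    X, Y = [1, 2, 3, 4], [5, 6, 7, 8]
    Z = first_row_of_W(X, Y)
    print(Z[0], sum(z * z for z in Z))  # Z1 and the product of the two sums of squares
```

Here $Z_1$ is the inner product $\sum X_iY_i$ and $Z_2, \ldots, Z_n$ are in general rational functions of the $X_i$, in line with the remark that denominators cannot be avoided entirely.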


Theorem 6 enables us to give another proof of the important group property of the set
$G_n = \{a \in K^* \mid a = a_1^2 + \cdots + a_n^2,\ a_i \in K\}$, when $n = 2^m$, a power of 2 (see corollary 2 after Cassels' lemma). For, theorem 6 implies closure of $G_n$ under multiplication while, as before, $a^{-1} \in G_n$ whenever $a \in G_n$.

Going back to our enquiry about how simple we can take the $Z_k$ as functions of the $X_i$, we now state the following striking result of Shapiro:

Theorem 7 (Shapiro, 1978). Let $n = 2^m$ and let $K$ be any field. In the n-square identity (5)

$$(X_1^2 + \cdots + X_n^2)(Y_1^2 + \cdots + Y_n^2) = Z_1^2 + \cdots + Z_n^2 \qquad (*)$$

with $Z_k$ linear in the $Y_j$ with coefficients in $K(X_1, \ldots, X_n)$, the first terms $Z_1, \ldots, Z_r$ of the right side of (5) can also be taken linear in the $X_i$ iff $r \le \rho(n)$, the Radon function.


For a proof of this result, see [13], page 183. 
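The text does not spell out $\rho(n)$. The sketch below assumes the standard Radon-Hurwitz definition: if $n = 2^{4a+b} \cdot (\text{odd})$ with $0 \le b \le 3$, then $\rho(n) = 8a + 2^b$. The function name `radon` is hypothetical.

```python
def radon(n):
    """Radon-Hurwitz number rho(n) for n = 2^(4a+b) * odd, 0 <= b <= 3."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    e = 0
    while n % 2 == 0:  # strip the power of 2 from n
        n //= 2
        e += 1
    a, b = divmod(e, 4)
    return 8 * a + 2 ** b


if __name__ == "__main__":
    for m in range(8):
        print(2 ** m, radon(2 ** m))
```

With this definition $\rho(n) = n$ exactly for $n = 1, 2, 4, 8$, and $\rho(16) = 9$, which is the count of fully bilinear terms available for $n = 16$.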

Incidentally, we note that in $(*)$ above we can easily arrange a formula where 8 of the $Z_k$ are bilinear (when $n \ge 8$). To do this start with the known (8, 8, 8) bilinear identity and apply the “doubling” process given in Pfister's theorem 5. Indeed write the 8-square identity twice over, once for the variables $X_1, \ldots, X_8$; $Y_1, \ldots, Y_8$; $Z_1, \ldots, Z_8$ and once for $X_9, \ldots, X_{16}$; $Y_9, \ldots, Y_{16}$; $Z_9, \ldots, Z_{16}$. Thus


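$$\begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_8 \end{pmatrix} =
\begin{pmatrix}
X_1 & -X_2 & -X_3 & -X_4 & -X_5 & -X_6 & -X_7 & -X_8 \\
X_2 & X_1 & -X_4 & X_3 & -X_6 & X_5 & X_8 & -X_7 \\
\vdots & & & & & & & \vdots \\
X_8 & X_7 & -X_6 & -X_5 & X_4 & X_3 & -X_2 & X_1
\end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_8 \end{pmatrix}$$

and

$$\begin{pmatrix} Z_9 \\ Z_{10} \\ \vdots \\ Z_{16} \end{pmatrix} =
\begin{pmatrix}
X_9 & -X_{10} & -X_{11} & -X_{12} & -X_{13} & -X_{14} & -X_{15} & -X_{16} \\
X_{10} & X_9 & -X_{12} & X_{11} & -X_{14} & X_{13} & X_{16} & -X_{15} \\
\vdots & & & & & & & \vdots \\
X_{16} & X_{15} & -X_{14} & -X_{13} & X_{12} & X_{11} & -X_{10} & X_9
\end{pmatrix}
\begin{pmatrix} Y_9 \\ Y_{10} \\ \vdots \\ Y_{16} \end{pmatrix}$$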


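(simply read off the identity (3)); say $\mathbf{Z}_1 = S_1\mathbf{Y}_1$ and $\mathbf{Z}_2 = S_2\mathbf{Y}_2$ for short. Then

$$\begin{pmatrix} \mathbf{Z}_1 \\ \mathbf{Z}_2 \end{pmatrix} =
\begin{pmatrix} S_1 & S_2 \\ -a^{-1}S_1S_2'S_1 & S_1 \end{pmatrix}
\begin{pmatrix} \mathbf{Y}_1 \\ \mathbf{Y}_2 \end{pmatrix} \qquad (a = X_1^2 + \cdots + X_8^2)$$

by Pfister's Theorem 5. We see that $Z_1, Z_2, \ldots, Z_8$ are bilinear in the $X_i$ and the $Y_j$ as claimed. The process can be repeated for 32, 64, ... variables; the 8 bilinear terms will persist.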

But of course even for n = 16, Theorem 7 is stronger than the above method as it gives 
us nine fully bilinear terms. 

This problem was posed by Baeza and solved by Shapiro in a letter to Baeza in 1976. 

Pfister has other very interesting results about Hilbert's 17th problem in the function fields $\mathbb{R}(X, Y)$ and more generally in $\mathbb{R}(X_1, X_2, \ldots, X_n)$. We refer our readers again to [13], [9].
Notes on the Prime Number Theorem-I


K. Ramachandra 


To Professor A.A. Karatsuba, Yu.V. Nesterenko and A.B. Shidlowsky 


§ 1 Introduction and Notation 


We begin by stating the Prime Number Theorem in a way somewhat different from the usual. Let $p_n$ denote the n-th prime (viz. $p_1 = 2, p_2 = 3, p_3 = 5, \ldots$). Then $p_n$ is given by the approximate formula

$$n = \int_2^{p_n} \frac{du}{\log u}.$$

Of course this is only an approximation. The Prime Number Theorem in the usual form states that the error here is $O(p_n^{1/2} \log p_n)$ (on Riemann hypothesis, namely $\zeta(s) \ne 0$ if $\mathrm{Re}(s) > \frac12$) and $O(p_n(\exp((\log p_n)^{3/5}(\log\log p_n)^{-1/5}))^{-\mu})$ unconditionally (u.c.), where $\mu > 0$ is an absolute constant, which is not very important ($O(\ldots)$ means a term whose absolute value is less than a constant times $\ldots$). Let $y \ge 2$ and $\mathrm{li}\, y = \int_2^y \frac{du}{\log u}$. Then the equation $x = \mathrm{li}\, y$ has a unique solution, say $y = f(x)$, with $f(0) = 2$. From $x = \mathrm{li}\, y$ it follows that $1 = (\log y)^{-1} \frac{dy}{dx}$ and hence $f'(x) = \log y$. From these it follows (by inverting the approximation for $p_n$ stated above), that


$$|p_n - f(n)| \le \begin{cases} C n^{1/2} (\log n)^{5/2}, & \text{on RH}, \\ n E^{-D}, & \text{u.c.}, \end{cases}$$


where $C$ and $D$ are absolute positive constants and $E = \exp((\log n)^{3/5}(\log\log n)^{-1/5})$. (Of course on the assumption $\zeta(s) = O((t^{(1-\sigma)^a} \log t)^A)$ $(t \ge 2,\ \frac12 \le \sigma \le 1)$, where $a \ge 1$ and $A > 0$ are absolute constants, there follow appropriate u.c. forms of the error terms in PNT and the above things can be formulated in terms of such error terms also; the function $E$ above corresponds to $a = \frac32$). Also a famous work of J.E. LITTLEWOOD states that $\pi(x)$, defined as $\sum_{p \le x} 1$, satisfies

$$\pi(x) = \mathrm{li}\, x + \Omega_\pm\!\left(\frac{x^{1/2}}{\log x} \log\log\log x\right);$$

this means that the upper and lower limits of $\pi(x) - \mathrm{li}\, x$ divided by $\frac{x^{1/2}}{\log x}\log\log\log x$ as $x \to \infty$ are respectively positive and negative. From this it follows (on putting $x = p_n$ and inverting) that

$$p_n - f(n) = \Omega_\pm((n \log n)^{1/2} \log\log\log n)$$

with a similar meaning (for $\Omega_\pm$) as explained just now. Thus on RH we have a very precise result for $p_n - f(n)$, namely both $O$ and $\Omega_\pm$ results nearly of the same order of precision. What is usually customary is to state the results for $\sum_{p_n \le x} \log p_n$ because this admits of a nice formula (called explicit formula) in terms of the zeros of $\zeta(s)$. It should be mentioned that all the results on $p_n$ stated above are equivalent to the corresponding ($O$ and $\Omega_\pm$) results on $\sum_{p_n \le x} \log p_n$, $\pi(x)$ and so on.
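To get a feeling for the approximation $p_n \approx f(n)$, one can invert $x = \mathrm{li}\, y$ numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: `li` uses Simpson's rule after the substitution $u = e^t$, and `f` applies Newton's method started at $y = 2$, using $dy/dx = \log y$ from the text. All function names are hypothetical.

```python
import math


def li(y, steps=2000):
    """li y = integral from 2 to y of du / log u (Simpson's rule in t = log u)."""
    a, b = math.log(2.0), math.log(y)
    if b == a:
        return 0.0
    h = (b - a) / steps

    def g(t):
        return math.exp(t) / t  # du / log u = e^t dt / t

    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def f(x, tol=1e-9):
    """Solve x = li y for y >= 2 by Newton's method (dy/dx = log y)."""
    y = 2.0
    for _ in range(200):
        step = (x - li(y)) * math.log(y)
        y += step
        if abs(step) < tol * max(1.0, y):
            break
    return y


def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [p for p, is_p in enumerate(sieve) if is_p]


if __name__ == "__main__":
    primes = primes_up_to(8000)
    for n in (10, 100, 1000):
        print(n, primes[n - 1], round(f(n), 1))
```

For small $n$ the agreement is only rough, which is consistent with error terms of the shape $n^{1/2}(\log n)^{5/2}$.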

The results of this paper are fairly well-known to experts working in this area. (It is meant for non-experts). The only justification for their publication is that we give a unified (and further generalised) version of the prime number theorem (on the number of primes $\le x$) and Landau's theorem (on the number of numbers $\le x$ which are either squares or sums of two squares) with Vinogradov's error term. (There are also other results such as the Montgomery-Vaughan theorem on square free numbers). The first step (see §2) is that for any generalised Dirichlet series $F(s)$ (defined by $\sum_{n=1}^\infty a_n\lambda_n^{-s}$, $s = \sigma + it$) satisfying some conditions we prove

$$\frac{1}{2\pi i}\int_{C-iT}^{C+iT} F(s)\frac{x^s}{s}\,ds = \sum_{\lambda_n \le x} a_n + E(T) \qquad \left(C = 1 + \frac{1}{\log x},\ 2 \le T \le x\right) \qquad (1^*)$$

where $E(T) = O(xT^{-1}\exp((\log x)^\varepsilon))$ for every fixed $\varepsilon > 0$ and the O-constant depends only on $\varepsilon$. (If $f_1$ is any complex number depending on some parameter and $f_2 > 0$ then $f_1 = O(f_2)$ will mean that $|f_1 f_2^{-1}|$ is bounded above. Sometimes we write $O_{\ldots}(f_2)$ to indicate that the bound depends on the constants $\ldots$). The next step (see §3) is to prove that $F(s)$ is analytic in the rectangle $(\sigma \ge 1 - \lambda(T),\ C_0 \le |t| \le T)$ for a constant $C_0 > 0$ and a suitable small positive function $\lambda(T)$ and establish the bound

$$F(s) = O(\exp((\log x)^\varepsilon)) \qquad (2)$$

on the boundary of this rectangle. The net result is

$$\sum_{\lambda_n \le x} a_n = M(x) + E_0(x, T)$$
$$\text{with } E_0(x, T) = O\big(x(\exp((\log x)^\varepsilon))(T^{-1} + x^{-\lambda(T)})\big) \qquad (3)$$
$$\text{where } M(x) = \frac{1}{2\pi i}\int_K F(s)\frac{x^s}{s}\,ds,$$

$K$ being the contour obtained by joining $1 - \lambda(T) - iC_0$, $C - iC_0$, $C + iC_0$ and $1 - \lambda(T) + iC_0$ in this order by straight line segments. Choosing $T$ by the requirement $T = \exp(\lambda(T)\log x)$ we are led to our main theorem. (Note that $M(x)$ can sometimes be as small as the error term itself, for example when $F(s) = (\zeta(s))^{-1}$.)

In our final choice of $F(s)$, we can improve the error estimate in our main theorem by assuming Riemann hypothesis and its generalisations to L-functions involved in $F(s)$. (See Remark 3 below). For unconditional results we have to depend on the deep estimate

$$\zeta(s) = O\Big(\big((|t| + 10)^{(1-\sigma)^a}\log(|t| + 10)\big)^A\Big) \qquad (4)$$

where $(\frac12 \le \sigma \le 1)$, $A > 0$ and $a \ge 1$ are constants. This with $a = \frac32$ is a deep estimate (see [E.C.T], [A.I], [A.A.K; S.M.V] and [K.R, A.S]) due to the Soviet mathematician I.M. VINOGRADOV. (It is well-known that $a = 1$ is trivial and leads to $\lambda(T)$ with $(\lambda(T))^{-1} = O(\log T)$ and hence to

$$\pi(x)\ \Big(\equiv \sum_{p \le x} 1\Big) = \int_2^x \frac{du}{\log u} + O\big(x(\exp(\sqrt{\log x}))^{-\mu}\big) \qquad (5)$$

where $\mu > 0$ is a constant which is unimportant). (The symbol $\equiv$ denotes a definition). But the merit of VINOGRADOV'S estimate is that it leads to the $\lambda(T)$ with $(\lambda(T))^{-1} = O((\log T)^{2/3}(\log\log T)^{1/3})$ (see Remark 1 below) and hence to the O-term

$$O\big(x(\exp((\log x)^{3/5}(\log\log x)^{-1/5}))^{-\mu}\big) \qquad (6)$$

in (5). It must be mentioned that any constant $> \frac35$ in place of $\frac35$ in (6) would be a very important result from many points of view.
Although the estimate (4) is not known (nor likely to be known for many centuries to come) for $a > \frac32$, we can assume (4) for any $a \ge 1$, and arrive at the O-term

$$O\big(x(\exp((\log x)^{\frac{a}{a+1}}(\log\log x)^{-\frac{a-1}{a+1}}))^{-\mu}\big) \qquad (7)$$

in (5).
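The exponents in (5), (6) and (7) are all instances of one rule. Assuming $(\lambda(T))^{-1}$ is of order $(\log T)^{1/a}(\log\log T)^{1-1/a}$ (the relation quoted in Remark 1), the requirement $\log T = \lambda(T)\log x$ gives, heuristically, $(\log T)^{1+1/a}(\log\log T)^{1-1/a} \asymp \log x$, hence $\log T \asymp (\log x)^{a/(a+1)}(\log\log x)^{-(a-1)/(a+1)}$. The short sketch below (hypothetical function names) tabulates these exponents exactly.

```python
from fractions import Fraction


def lambda_exponents(a):
    """Exponents (u, v) in 1/lambda(T) = O((log T)^u (log log T)^v)."""
    a = Fraction(a)
    return 1 / a, 1 - 1 / a


def error_exponents(a):
    """Exponents (p, q) in the O-term x * exp(-mu (log x)^p (log log x)^(-q))."""
    a = Fraction(a)
    return a / (a + 1), (a - 1) / (a + 1)


if __name__ == "__main__":
    for a in (Fraction(1), Fraction(3, 2), Fraction(2)):
        print(a, lambda_exponents(a), error_exponents(a))
```

The case $a = 1$ reproduces the $\sqrt{\log x}$ in (5) and $a = \frac32$ reproduces the exponents $\frac35$ and $\frac15$ in (6).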


Remark 1 The only method of getting the best known unconditional $\lambda(T)$ seems to be via (4). The best $\lambda(T)$ follows from (4) by an easy function-theoretic device due to J. HADAMARD, de-la-VALLÉE POUSSIN and E. LANDAU. (For the relation between $a$ and $\lambda(T)$, namely $(\lambda(T))^{-1} = O((\log T)^{1/a}(\log\log T)^{1-1/a})$, see the booklet [K.R]$_1$ by K. RAMACHANDRA). Instead of the assumption (4) we can assume a $\lambda(T)$ as a hypothesis and proceed and get the corresponding error estimate in the main result. (A more illuminating method of getting $\lambda(T)$ from (4) is due to H.L. MONTGOMERY and later to K. RAMACHANDRA [K.R]$_5$ where he proves a slight refinement on the result of H.L. MONTGOMERY).


Remark 2 The inequality (4) has been extended suitably (by a few mathematicians, notably by T. MITSUI [T.M]) to cover zeta and L-functions of algebraic number fields (with $a = \frac32$ and $A > 0$ depending on the L-functions). The corresponding $\lambda(T)$ satisfies $(\lambda(T))^{-1} = O((\log T)^{2/3}(\log\log T)^{1/3})$ where the O-constant depends on the L-function. This leads to the error estimate (7) in our main result, which is fairly general.


Remark 3 If we assume the analogue of Riemann hypothesis for all the L-functions occurring in our final choice of $F(s)$, then we can take $\lambda(T) = \frac12 - B_0(\log\log T)^{-1}$, where $B_0$ is a positive constant depending on $F(s)$. Using some results of J.E. LITTLEWOOD; A. SELBERG; K. RAMACHANDRA and A. SANKARANARAYANAN (for reference see the paper [K.R; A.S]$_2$ by the last two authors) we can obtain the error estimate $O(x^{\frac12 + \psi(x)})$, where $\psi(x) > 0$ and $\psi(x) = O((\log\log x)^{-1})$. For some choices of $F(s)$ we can get better error terms. For example if $F(s) = \sum p^{-s}$ ($p$ running over all primes) we have (on Riemann hypothesis, namely the analyticity of this $F(s)$ in $\sigma > \frac12$, $t \ge 1$),

$$\pi(x)\ \Big(\equiv \sum_{p \le x} 1\Big) = \int_2^x \frac{du}{\log u} + O(x^{1/2}\log x).$$


§ 2 First Step 


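Let $0 < \lambda_1 < \lambda_2 < \cdots$ and let $\lambda_n \to \infty$ and further let, for all $x \ge 100$ and for a certain constant $\delta > 0$,

$$\sum_{|\lambda_n - x| \le x^{1-\delta}} |a_n| = O(x^{1-\delta}\exp((\log x)^\varepsilon)) \qquad (8)$$

where $\varepsilon > 0$ is arbitrary and the O-constant depends (as usual) only on $\varepsilon$. (Note that (8) implies that

$$\sum_{|x - \lambda_n| \le J} |a_n| = O(J\exp((\log x)^\varepsilon))$$

for all $J$ with $x^{1-\delta} \le J \le x$). Then we have
Theorem 1 Let $E(T)$ be defined by $(1^*)$. Then subject to (8), we have

$$E(T) = O(xT^{-1}\exp((\log x)^\varepsilon)), \qquad (9)$$

provided $x^{1-\delta} \le xT^{-1} \le x$, i.e. $1 \le T \le x^\delta$.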


Remark: This is not very different from what is usually called PERRON'S FORMULA.

Proof: Under the condition (8), the series for $F(s)$ is absolutely convergent in $\sigma > 1$ and so on $\sigma = C$. Hence LHS of $(1^*)$ is

$$\frac{1}{2\pi i}\int_{C-iT}^{C+iT} \sum_{n=1}^\infty a_n\left(\frac{x}{\lambda_n}\right)^s \frac{ds}{s} = \sum_{n=1}^\infty a_n f\!\left(\frac{x}{\lambda_n}, C, T\right), \text{ say}. \qquad (10)$$

Now for all $y > 0$ we have the following well-known result:

$$f(y, C, T) = \delta(y) + O(y^C |T\log y|^{-1}), \quad y \ne 1, \qquad (11)$$

where $\delta(y) = 0$ or $1$ according as $y < 1$ or $y > 1$. (This result is well-known and can be established by moving the line of integration to $\sigma = \infty$ if $y < 1$ and to $\sigma = -\infty$ if $y > 1$). Also if $y = 1$ then it is easy to see that $f(y, C, T) = \frac12 + O(T^{-1})$. It is sometimes convenient to use

$$f(y, C, T) = \delta(y) + O(y^C) \qquad (12)$$

which can be established by moving the line of integration to $\sigma = T$ if $y < 1$ and to $\sigma = -T$ if $y > 1$. In (12) we can take $\delta(y) = 0$ or $1$ according as $y \le 1$ or $y \ge 1$.
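The behaviour of the truncated integral $f(y, C, T) = \frac{1}{2\pi i}\int_{C-iT}^{C+iT} y^s\,\frac{ds}{s}$ in (11) can be observed numerically. The sketch below is illustrative: the name `perron_kernel`, the choice of Simpson's rule and the sample values of $y$, $C$, $T$ are assumptions of this example, not taken from the text.

```python
import cmath
import math


def perron_kernel(y, C, T, steps=20000):
    """Approximate (1 / (2 pi i)) * integral of y^s / s ds over s = C + it, |t| <= T."""
    h = 2 * T / steps
    log_y = math.log(y)

    def g(t):
        s = complex(C, t)
        return cmath.exp(s * log_y) / s

    total = g(-T) + g(T)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(-T + i * h)
    # ds = i dt, so the factor 1 / (2 pi i) becomes 1 / (2 pi); the result is real.
    return (total * h / 3 / (2 * math.pi)).real


if __name__ == "__main__":
    C, T = 1.1, 200.0
    for y in (0.5, 1.0, 2.0):
        print(y, round(perron_kernel(y, C, T), 5))
```

The three values come out close to $0$, $\frac12$ and $1$ respectively, with deviations of the size predicted by the O-terms.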
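Thus LHS of $(1^*)$ is equal to

$$\sum_{\lambda_n \le x} a_n + O\Big(\sum\nolimits_1 + \sum\nolimits_2 + \sum\nolimits_3\Big) \qquad (13)$$

where

$$\sum\nolimits_1 = \sum_{\lambda_n \le \frac12 x} |a_n|\left(\frac{x}{\lambda_n}\right)^C \left|T\log\frac{x}{\lambda_n}\right|^{-1}, \qquad
\sum\nolimits_2 = \sum_{\lambda_n \ge 2x} |a_n|\left(\frac{x}{\lambda_n}\right)^C \left|T\log\frac{x}{\lambda_n}\right|^{-1},$$

and

$$\sum\nolimits_3 = \sum_{\frac12 x < \lambda_n < 2x} |a_n| \min\left(1, \left|T\log\frac{x}{\lambda_n}\right|^{-1}\right).$$

Trivially $\sum_1 + \sum_2 = O\big(xT^{-1}\sum_{n=1}^\infty |a_n|\lambda_n^{-C}\big)$, and the last infinite series is (by (8))

$$O\big(\exp((\log 2)^\varepsilon)2^{1-C} + \exp((\log 4)^\varepsilon)4^{1-C} + \cdots\big)
= O\Big(\sum_{n=1}^\infty (\exp(n^\varepsilon))2^{-n(\log x)^{-1}}\Big)
= O\Big(\sum_{n \le 100(\log x)^2} + \text{the rest}\Big) = O(\exp((\log x)^{2\varepsilon})).$$

To tackle $\sum_3$ we use $\left|\log\frac{x}{\lambda_n}\right|^{-1} = O(x|x - \lambda_n|^{-1})$ for $|x - \lambda_n| \ge xT^{-1}$. (The rest is trivially, by (8), $O(xT^{-1}\exp((\log x)^\varepsilon))$.) We break this up into

$$xT^{-1}2^{m-1} \le |x - \lambda_n| \le 2^m xT^{-1} \quad (m = 1, 2, 3, \ldots);$$

and by an easy calculation we end up with Theorem 1 (since $\varepsilon > 0$ is arbitrary).
Another method of proving theorem 1 is to apply a powerful theorem of K. RAMACHANDRA (see [K.R]$_2$). □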


§ 3 Step 2 


Now we specialise $F(s)$ further as follows. Let us consider the set of all L-functions of all possible number fields. (Mathematicians not familiar with number fields and their L-functions can ignore them and confine to the case $\zeta(s)$). Let $S_0$ be a fixed finite subset of this set. Let $S_1$ be a power product (with complex constants as exponents) of functions in $S_0$. Let $S_2$ be a power product (with non negative integer constants as exponents) of functions which are logarithms (in $\sigma > 1$) of functions in $S_0$. Let $S_3$ be a power product (with non negative integer constants as exponents) of all possible derivatives (with bounded order) of functions of $S_0$. Finally let $f(s) = \sum_{n=1}^\infty b_n n^{-s}$ be a Dirichlet series for which $\sum_{n=1}^\infty |b_n| n^{-\mu_0} < \infty$ for some constant $\mu_0 < 1$. Then our function $F(s)$ is defined by

$$F(s) = S_1 S_2 S_3 f(s) \quad (\sigma > 1) \qquad (14)$$

and its analytic continuations in $(\sigma > \mu_0)$ along all possible paths in the complex s-plane with certain lines $\Lambda$ removed. ($\Lambda$ consists of all lines, parallel to the real axis, which contain a singularity of L-functions in $S_1$, $S_2$ and $S_3$). We select that branch for which $\log L(s) \to 0$ as $\sigma \to \infty$.

We now make the following hypothesis and proceed. 


Hypothesis: $L(s) \ne 0$ in $(\sigma \ge 1 - \lambda(T),\ |t| \le T)$ where $(\lambda(T))^{-1} = O((\log T)^{1/a}(\log\log T)^{1-1/a})$ for some constant $a \ge 1$.

Lemma 1 In $(\sigma \ge 1 - \lambda(T),\ |t| \le T)$, we have $L(s) = O(T^d)$ where $d$ is the degree of the number field, provided $L(s)$ is formed with the non-principal character. If $L(s)$ is formed with the principal character then $|L(s) - K_0(s - 1)^{-1}| = O(T^d)$, where $K_0$ is a positive constant.

Remark: In the main term $M(x)$ defined by (3) the constant $C_0$ is not very important. Hence we can confine our attention to sufficiently large values of $|t|$ (i.e. to $|t|$ exceeding a large constant at our choice). Moreover plainly we see that it suffices to consider large values of $t$ (that of $-t$ is covered by complex conjugation). In this case the lemma is well-known.


Lemma 2 (BOREL-CARATHÉODORY). Suppose $f(z)$ is analytic in $|z - z_0| \le R$ and on the circle $z = z_0 + Re^{i\theta}$ $(0 \le \theta \le 2\pi)$ we have $\mathrm{Re}\, f(z) \le U$. Then in $|z - z_0| \le r < R$, we have

$$|f(z) - f(z_0)| \le \frac{2r(U - \mathrm{Re}\, f(z_0))}{R - r}.$$

Proof: See Theorem 1.6.1 of [K.R]$_3$ (This is a well-known Theorem and we have stated it with the notation adopted therein). □
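As a sanity check on the shape of the Borel-Carathéodory inequality, one can compare both sides for a concrete entire function. The sketch below is purely illustrative: the helper name `borel_caratheodory_sides`, the test function $f(z) = e^z$ and the sampling of the circles are assumptions of this example.

```python
import cmath
import math


def borel_caratheodory_sides(f, z0, R, r, samples=2000):
    """Return (max |f(z) - f(z0)| on |z - z0| = r, 2r(U - Re f(z0)) / (R - r)).

    U is estimated as the maximum of Re f over sample points of the circle
    |z - z0| = R; by the maximum modulus principle the left side on the
    circle of radius r bounds |f(z) - f(z0)| on the whole closed disc.
    """
    def circle(rad):
        return [z0 + rad * cmath.exp(2j * math.pi * k / samples) for k in range(samples)]

    U = max(f(z).real for z in circle(R))
    lhs = max(abs(f(z) - f(z0)) for z in circle(r))
    rhs = 2 * r * (U - f(z0).real) / (R - r)
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = borel_caratheodory_sides(cmath.exp, 0j, 1.0, 0.5)
    print(round(lhs, 4), round(rhs, 4))
```

For $f(z) = e^z$, $z_0 = 0$, $R = 1$, $r = \frac12$ the left side is $e^{1/2} - 1$ and the right side is $2(e - 1)$, so the inequality holds with room to spare.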


Lemma 3 We have, in $(\sigma \ge 1 - \frac12\lambda(T),\ C_0 \le |t| \le T)$, the estimate
log L(s) = O((log T)*’). 
Also $\log L(s) = O(1)$ in $\sigma \ge 1 + \varepsilon$, where the O-constant depends on $\varepsilon$.


Proof: The second part is trivial by looking at the (absolutely convergent) Dirichlet series 
for log L(s). To prove the first part we apply lemma 2 with f(z) = log L(z), zo = 
2+it, R=1+A(T), r=1+ SA(T). By lemma | we can take U to be (log T)*. This 
gives the first part. □
Lemma 4 For all $\varepsilon$ $(0 < \varepsilon < \frac12)$ we have in $(\sigma \ge 1 - \varepsilon\lambda(T),\ C_0 \le |t| \le T)$ the estimate

$$\log L(s) = O((\log T)^\varepsilon) \qquad (15)$$

where the O-constant depends only on $\varepsilon$.

Proof: We apply maximum modulus principle. Let $s_0$ belong to the rectangle in question. Put $s_0 = \sigma_0 + it_0$, $G(s) = e^{(s - s_0)^2} X^{s - s_0}\log L(s)$. (We can assume that $\sigma_0 \le 1 + \varepsilon$ since otherwise we are through).
Then $|G(s_0)| \le \max |G(s)|$ taken over the sides of the rectangle

$$\left(|t - t_0| \le \log T,\ 1 - \tfrac12\lambda(T) \le \sigma \le 1 + \varepsilon\right).$$

Note that $e^{(s - s_0)^2} = O(e^{-(t - t_0)^2})$. The lemma follows by a proper choice of $X$. □
Theorem 2 Subject to our choice of $F(s)$ as stated in the beginning of this section, we have

$$\sum_{n \le x} a_n = M(x) + O\big(x(\exp((\log x)^{\frac{a}{a+1}}(\log\log x)^{-\frac{a-1}{a+1}}))^{-\mu}\big). \qquad (16)$$

Proof: In (3) we have only to replace $\lambda(T)$ by $\varepsilon\lambda(T)$ and use Lemma 4 and choose $T$ suitably. □

Remark: We remark that $\log\zeta(\sigma + it) = O(\log\log t)$ in $(\sigma \ge 1,\ t \ge 100)$ and a similar result holds for $\log L(\sigma + it)$. These are not hard to prove.


§ 4 Discussion of M(x)
Note that $M(x)$ depends on $T$. We will now see that

$$M(x) = M_0(x) + O\big(x(\exp((\log x)^{\frac{a}{a+1}}(\log\log x)^{-\frac{a-1}{a+1}}))^{-\mu}\big)$$

where $M_0(x)$ is a certain contour integral. Since $L(1 + it) \ne 0$ and $\zeta_k(s)(s - 1) \ne 0$ for $s = 1 + it$ ($\zeta_k(s)$ denotes the zeta-function of the number field $k$), it follows that $\log L(s)$ and $\log(\zeta_k(s)(s - 1))$ are regular in $(\sigma \ge 1 - \delta,\ |t| \le C_0)$ (for some constant $0 < \delta < 1$) and are bounded in $(\sigma \ge 1 - \frac12\delta,\ |t| \le C_0)$. Hence if we write

$$H(s) = F(s)(s - 1)^{\alpha_0}\left(\log\frac{1}{s - 1}\right)^{-\beta_0}$$

and choose the complex constant $\alpha_0$ and the non-negative integer constant $\beta_0$ suitably, then $H(s)$ is regular in $(\sigma \ge 1 - \frac12\delta,\ |t| \le C_0)$ and is bounded there. Hence we can replace the contour (in $M(x)$), namely the one obtained by joining by straight line segments the points $1 - \lambda(T) - iC_0$, $C - iC_0$, $C + iC_0$, $1 - \lambda(T) + iC_0$, by the new contour obtained by joining

$$1 - \lambda(T) - iC_0,\quad 1 - \tfrac12\delta - iC_0,\quad \underline{1 - \tfrac12\delta - i0},\quad \underline{1 - \tfrac12\delta + i0},\quad 1 - \tfrac12\delta + iC_0,\quad 1 - \lambda(T) + iC_0.$$

Here we mean by straight lines, except the ones with a bar below where we mean the circle $D$ with centre 1 and radius $\frac12\delta$ with the point $1 - \frac12\delta$ removed. Apart from the circular contour $D$ the contribution of the rest of the contour is $O(x^{1 - \lambda(T)})$ since $H(s) = O(1)$ and

$$(s - 1)^{-\alpha_0}\left(\log\frac{1}{s - 1}\right)^{\beta_0} = O(1).$$

Thus

$$M(x) = M_0(x) + O\big(x(\exp((\log x)^{\frac{a}{a+1}}(\log\log x)^{-\frac{a-1}{a+1}}))^{-\mu}\big) \qquad (17)$$

where

$$M_0(x) = \frac{1}{2\pi i}\int_D F(s)\frac{x^s}{s}\,ds.$$

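Now $H(s)$ being analytic in the disc $|s - 1| \le \delta$, we have

$$\frac{H(s)}{s} = \sum_{n=0}^\infty a_n^*(s - 1)^n$$

where

$$a_n^* = \frac{1}{2\pi i}\int_{D^*} \frac{H(s)}{(s - 1)^{n+1}}\,\frac{ds}{s},$$

where $D^*$ is the circle $|s - 1| = \delta$, and so $a_n^* = O(\delta^{-n})$. Thus in $|s - 1| \le \frac12\delta$, we have

$$\frac{H(s)}{s} = \sum_{n=0}^N a_n^*(s - 1)^n + R,$$

where

$$R = O\Big(\sum_{n=N+1}^\infty \delta^{-n}\left(\tfrac12\delta\right)^n\Big) = O(2^{-N}).$$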
n=N+1 


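Finally we arrive at the asymptotic expansion

$$M_0(x) = \sum_{0 \le n \le N} a_n^*\left(\frac{1}{2\pi i}\int_D (s - 1)^{n - \alpha_0}\left(\log\frac{1}{s - 1}\right)^{\beta_0} x^s\,ds\right) + O(2^{-N}x(\log x)^{l_0}),$$

where $l_0$ is a real constant. This can be seen by deforming the contour $D$ (for $n \le N$) suitably.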
Here the integral on the right is 


I n—-@ 1 ‘“ AY 
— | (s—1) “ {log x'ds 
20 i D s—l 


Ib 


1 
eee (s — 1)" (108 


QTV Jaws 


] Po 1 
:) x8ds + O(x!~28 (log x)5-7%). 
S —_— 
The path $(-\infty, 1+)$ is the line joining $-\infty - i0$ to $1 - i0$, and this to $1 + i0$ by an indentation at $s = 1$, and then $1 + i0$ to $-\infty + i0$. The last integral is (with an obvious notation)

$$\frac{x}{2\pi i}\int_{-\infty}^{0+} s^{n - \alpha_0}\left(\log\frac{1}{s}\right)^{\beta_0} x^s\,ds.$$


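Thus we are led to study (in the first instance)

$$\frac{1}{2\pi i}\int_{-\infty}^{0+} s^{-\gamma_0}x^s\,ds = (\log x)^{\gamma_0 - 1}\,\frac{1}{2\pi i}\int_{-\infty}^{0+} w^{-\gamma_0}e^w\,dw$$

(by the substitution $s = w(\log x)^{-1}$), where $\gamma_0$ is a complex constant.
Now

$$J(\gamma_0) = \frac{1}{2\pi i}\int_{-\infty}^{0+} w^{-\gamma_0}e^w\,dw$$

is analytic in $\gamma_0$ and it is

$$\frac{1}{2\pi i}\int_0^\infty (ue^{-i\pi})^{-\gamma_0}e^{-u}\,du - \frac{1}{2\pi i}\int_0^\infty (ue^{i\pi})^{-\gamma_0}e^{-u}\,du$$

(on splitting the path of integration into $(-\infty, 0)$ and $(0, -\infty)$ respectively),

$$= \frac{1}{2\pi i}(2i\sin(\pi\gamma_0))\Gamma(1 - \gamma_0) = \frac{1}{\Gamma(\gamma_0)}\cdot\frac{\sin(\pi\gamma_0)}{\pi}\,\Gamma(\gamma_0)\Gamma(1 - \gamma_0) = \frac{1}{\Gamma(\gamma_0)}$$

and this gives the expression for $J(\gamma_0)$ in terms of the familiar function $\Gamma(z)$. Also for any complex constant $\gamma_0$ we have

$$\frac{1}{2\pi i}\int_{-\infty}^{0+} s^{-\gamma_0}\left(\log\frac{1}{s}\right)^{\beta_0}x^s\,ds
= \left[\frac{d^{\beta_0}}{dz^{\beta_0}}\big((\log x)^{z-1}J(z)\big)\right]_{z = \gamma_0}
= \left[\frac{d^{\beta_0}}{dz^{\beta_0}}\left(\frac{(\log x)^{z-1}}{\Gamma(z)}\right)\right]_{z = \gamma_0}.$$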

This completes the discussion of M(x). We have still to check the condition (8). For this 
observe that the coefficients of the Dirichlet series for F (s) are majorised by those of 


dg (n)(log n)427\ (SA bn! 
aoc eae |, 


n=2 n=l 
for some integer constant g > 0. Now


Y> [bnl(dg(m)(2 log m)? + 1) 
X<mn<X+H 


—O SY [Bnldg(m) } (log X)*4) 
X<mn<X+H 


and the last sum is 


> [Dn > dg(m) _ y+ 2 say, 
sns H ] 2 


where 9°) = Dop<n<xn -»- (and) Dy = Doxn encx ---and 0 < n < Lis at our choice. 
Since dg(n) = O¢.qg(n*) we have 


Leet Se (= +1) ibnix* 


2 X"<n<X 


one) 
€ —(l-—po)n lbn| € 
ON ERK AHN yo Px De, Meal 
n=1 n<X 


where the last sum is < X* Oonex lb, |n—~49)X#0 = O(xHot€) , and the sum previous to 
itis O(HX—U— Hoty, 
To treat $\sum_1$ we use

$$\sum_{n \le x} d_k(n) = xP_{k-1}(\log x) + O(x^{1 - \frac1k}(\log x)^{k-2})$$

where $k > 0$ is an integer constant, $x \ge 2$ is a variable and $P_{k-1}(\log x)$ is a polynomial in $\log x$ of degree $k - 1$. (For this result see equation (12.1.4) on page 313 of [E.C.T]). Now
on using this result, 


> =0 


(provided $H = x^{1-\delta}$ and $\delta$ and $\eta$ are small constants). Condition (8) now follows. Thus we have the following theorem (by choosing $N = (\log x)^{B^*}$ where $B^*$ $(0 < B^* < 1)$ is an arbitrary positive constant).


S> Ibn (dog x") = O( Hog X)%), 


n<X" 


Theorem 3 (MAIN THEOREM) Subject only to the choice of $F(s)$ as in the beginning of §3, we have

$$\sum_{\lambda_n \le x} a_n = xM^*(x) + O\big(x(\exp((\log x)^{\frac{a}{a+1}}(\log\log x)^{-\frac{a-1}{a+1}}))^{-\mu}\big), \qquad (18)$$

where $M^*(x)$ has the asymptotic expansion

$$M^*(x) = \sum_{\nu = j}^\infty A_\nu(\log x)^{\gamma_0 - \nu}\sum_{0 \le k \le \beta_0} B_k(\log\log x)^k, \qquad (19)$$

where $j$ is an integer constant, $\beta_0 \ge 0$ is an integer constant, and $A_\nu$ and $B_k = B_k(\nu)$ are complex constants and $\gamma_0$ is a complex constant.

Remark 1 It must be remembered that Theorem 3 is subject to the hypothesis in §3 made after equation (14). Theorem 3 is unconditional only if $a = \frac32$. The hypothesis just referred to with $a = \frac32$ is a deep theorem established by the Soviet mathematician I.M. VINOGRADOV. It may take many centuries to prove the hypothesis with some $a > \frac32$.
Remark 2 It may happen that many $A_\nu$ are zero in the beginning so as to make $xM^*(x)$ as small as the O-term (for example when $F(s) = (\zeta(s))^{-1}$). If $F(s)$ has a pole at $s = 1$ even then the treatment is simpler.


§ 5 An Alternative Method 


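Consider $F(s) = \sum_{n=1}^\infty a_n\lambda_n^{-s}$ where $a_n \ge 0$ and $0 < \lambda_1 < \lambda_2 < \ldots$ and $\lambda_n \to \infty$. Then it is possible to get rid of the condition (8) in the following way. Put $A(x) = \sum_{\lambda_n \le x} a_n$ and $B(x) = \int_0^x A(u)\,du = \sum_{\lambda_n \le x} a_n(x - \lambda_n)$. It is much simpler to deal with $B(x)$. Suppose that $B(x) = \int_0^x M(u)\,du + O(x^2\Delta)$ where $0 < \Delta < \frac14$, with $M(u) = \frac{1}{2\pi i}\int_{K_1} F(s)u^s\frac{ds}{s}$, $K_1$ being a suitable contour.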

Then we have the following Tauberian Lemma. 


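A Tauberian Lemma: We have

$$A(x) = M(x) + O\Big(x\Delta^{1/2}\Big(1 + \max_{\frac12 x \le \xi \le \frac32 x}|M'(\xi)|\Big)\Big).$$

Proof: Let $0 < \delta \le \frac12$. Then

$$B(x + \delta x) - B(x) = \int_x^{x + \delta x} A(u)\,du \ge \delta x A(x).$$

Here the LHS is

$$\int_x^{x + \delta x} M(u)\,du + O(x^2\Delta) = \delta x\Big[M(x) + O\Big(x\delta\max_{\frac12 x \le \xi \le \frac32 x}|M'(\xi)|\Big)\Big] + O(x^2\Delta).$$

Thus

$$A(x) \le M(x) + O\Big(x\delta\max_{\frac12 x \le \xi \le \frac32 x}|M'(\xi)|\Big) + O(x\Delta\delta^{-1}).$$

Similarly we have

$$\delta x A(x) \ge B(x) - B(x - \delta x) = \int_{x - \delta x}^x M(u)\,du + O(x^2\Delta) = \delta x\Big[M(x) + O\Big(x\delta\max_{\frac12 x \le \xi \le \frac32 x}|M'(\xi)|\Big)\Big] + O(x^2\Delta)$$

and so

$$A(x) \ge M(x) + O\Big(x\delta\max_{\frac12 x \le \xi \le \frac32 x}|M'(\xi)|\Big) + O(x\Delta\delta^{-1}).$$

Combining both we have

$$A(x) = M(x) + O\Big(x\delta\max_{\frac12 x \le \xi \le \frac32 x}|M'(\xi)|\Big) + O(x\Delta\delta^{-1});$$

the choice $\delta = \Delta^{1/2}$ gives the lemma. □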
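The whole proof hinges on the sandwich $B(x) - B(x - h) \le hA(x) \le B(x + h) - B(x)$ with $h = \delta x$, which uses only $a_n \ge 0$. The sketch below checks it exactly in the simplest setting; the choices $\lambda_n = n$ and $a_n = d(n)$ (the number of divisors) and all function names are assumptions made for this illustration.

```python
def divisor_counts(N):
    """List a with a[n - 1] = d(n), the number of divisors of n, for 1 <= n <= N."""
    a = [0] * N
    for d in range(1, N + 1):
        for m in range(d, N + 1, d):
            a[m - 1] += 1
    return a


def A(a, x):
    """A(x) = sum of a_n over n <= x (integer x, lambda_n = n)."""
    return sum(a[:x])


def B(a, x):
    """B(x) = sum of a_n (x - n) over n <= x, i.e. the integral of A from 0 to x."""
    return sum(an * (x - n) for n, an in enumerate(a[:x], start=1))


def sandwich(a, x, h):
    """Lower and upper difference quotients enclosing A(x)."""
    lower = (B(a, x) - B(a, x - h)) / h
    upper = (B(a, x + h) - B(a, x)) / h
    return lower, upper


if __name__ == "__main__":
    a = divisor_counts(2000)
    x, h = 1000, 100
    lower, upper = sandwich(a, x, h)
    print(lower, A(a, x), upper)
```

Shrinking $h$ tightens the sandwich but, in the lemma, amplifies the error $O(x^2\Delta)$ in $B$; the choice $\delta = \Delta^{1/2}$ balances the two effects.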


Remark 1 In many cases it is possible to prove (without bringing in condition (8)) that $B(x) = \int_0^x M(u)\,du + O(x^2\Delta)$ where $\Delta = O((\exp((\log x)^{3/5}(\log\log x)^{-1/5}))^{-\mu})$ and $M'(u) = O((\log(|u| + 10))^g)$ where $g > 0$ is an integer constant. Thus for example if $F(s) = -\zeta'(s)(\zeta(s))^{-1}$ we can recover the prime number theorem with the VINOGRADOV error term.


Remark 2 Consider $F(s) = \sum_{n=1}^\infty (\varphi(n))^{-s}$, where $\varphi(n)$ is Euler's totient function. In this case we can apply the alternative method. But we have to be content with the best known (unconditional) result

$$\sum_{\varphi(n) \le x} 1 = \alpha_0 x + O\big(x(\exp\sqrt{\log x\log\log x})^{-\mu}\big) \qquad \Big(\alpha_0 = \prod_p\Big(1 + \frac{1}{p(p - 1)}\Big)\Big),$$

due to P.T. BATEMAN (see [P.T.B]). Even the assumption of Riemann hypothesis that $\zeta(s) \ne 0$ in $\sigma > \frac12$ does not seem to result in any improvement over this unconditional result. It may be noted that for this special choice of $F(s)$ we have

$$F(s) = \zeta(s)\prod_p\Big(1 - \frac{1}{p^s} + \frac{1}{(p - 1)^s}\Big).$$

Thus $F(s)$ and $\zeta(s)$ have the same zeros in $\sigma > 0$ (since the infinite product is regular in $\sigma > 0$ and does not vanish there). Note that the infinite product is

$$\exp(O(t^{1 - \sigma}\log\log t)) \qquad \left(t \ge 100,\ \tfrac12 \le \sigma \le 1\right)$$

and Bateman's result follows from this and the trivial estimate $\zeta(s) = O(t^{1 - \sigma}\log t)$ $(t \ge 10,\ \frac12 \le \sigma \le 1)$.
§ 6 Squarefree Numbers and other Problems

We now proceed to explain a method due to H.L. MONTGOMERY and R.C. VAUGHAN (see [H.L.M; R.C.V]). For instance we can apply this method to estimate things like $\sum_{n \le x}|\mu(n)| - \frac{6}{\pi^2}x$ (they devised this method in connection with this problem on the assumption of RH (Riemann hypothesis)), and

$$\sum_{mn^2 \le x}\mu(n)d(m) - Ax\log x - Bx \quad \text{(for suitable constants } A, B\text{)}$$

where $\mu(n)$ and $d(n)$ are defined by $(\zeta(s))^{-1} = \sum\mu(n)n^{-s}$ and $(\zeta(s))^2 = \sum d(n)n^{-s}$. (The best known unconditional results are due to A. WALFISZ [A.W]. His method gives $O(x^{\frac25 + \varepsilon})$, on RH, for the first problem. However the method of A. WALFISZ was discovered first by A. AXER [A.A] who assumed RH and proved that $\sum_{n \le x}\mu_k(n) = (\zeta(k))^{-1}x + O(x^{\frac{2}{2k+1} + \varepsilon})$ where $\zeta(s)(\zeta(ks))^{-1} = \sum_{n=1}^\infty \mu_k(n)n^{-s}$, $k > 1$ being an integer constant. AXER assumed that $\sum_{n \le x}\mu(n) = O(x^{\frac12 + \varepsilon})$, a fact which was shown by J.E. LITTLEWOOD [J.E.L.] to follow from RH). The method of H.L. MONTGOMERY and R.C. VAUGHAN can also be applied to things like $\sum_{mn^2 \le x}\mu(n)r(m)$ ($r(n)$ being defined by $\zeta(s)\left(1 - \frac{1}{3^s} + \frac{1}{5^s} - \cdots\right) = \sum_{n=1}^\infty r(n)n^{-s}$) and also to $\sum_{mn^2 \le x}\mu(n)d_3(m)$ ($d_3(n)$ being defined by $(\zeta(s))^3 = \sum_{n=1}^\infty d_3(n)n^{-s}$). We will remark about some other applications at the end of this section. (It is good to mention here the results

$$\sum_{n \le x} d(n) = x\log x + (2\gamma - 1)x + O\big(x^{\frac{23}{73}}(\log x)^{\frac{315}{146}}\big),$$
$$\sum_{n \le x} r(n) = \frac{\pi}{4}x + O\big(x^{\frac{23}{73}}(\log x)^{\frac{315}{146}}\big),$$

both due to M.N. HUXLEY [M.N.H], and the result $\sum_{n \le x} d_3(n) = xP_2(\log x) + O(x^{\frac{43}{96} + \varepsilon})$ ($P_2(\log x)$ is a polynomial in $\log x$ of degree 2) due to G. KOLESNIK [G.K]). We begin with $\zeta(s)(\zeta(ks))^{-1} = \sum_{n=1}^\infty \mu_k(n)n^{-s}$ where $k \ge 2$ is an integer constant and $\mu_k(n) = 1$ if $n$ is k-th power free and zero otherwise. When $k = 2$ this is the characteristic function of square-free numbers. We have

$$\sum_{n \le x}\mu_k(n) = \sum_{n^k m \le x}\mu(n) = \sum_{n \le x^{1/k}}\mu(n)\left[\frac{x}{n^k}\right] = x\sum_{n=1}^\infty \frac{\mu(n)}{n^k} + E_0, \qquad E_0 = \sum_{n=1}^\infty \mu(n)\left(\left[\frac{x}{n^k}\right] - \frac{x}{n^k}\right).$$

Now $E_0 = \sum_{n \le Y} + \sum_{n > Y} = E_1 + E_2$, say. Clearly $E_1 = O(Y)$ and

$$E_2 = \sum_{n > Y}\mu(n)\left(\left[\frac{x}{n^k}\right] - \frac{x}{n^k}\right).$$

Note that $\left[\frac{x}{n^k}\right]$ and $\frac{x}{n^k}$ are monotonic.
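The starting identity $\sum_{n \le x}\mu_k(n) = \sum_{n \le x^{1/k}}\mu(n)[x/n^k]$ can be confirmed by brute force. The sketch below is illustrative; the function names are hypothetical and the comparison with $x/\zeta(2) = 6x/\pi^2$ is only a rough check of the main term.

```python
import math


def mobius_upto(n):
    """List mu with mu[m] equal to the Moebius function of m, for 0 <= m <= n."""
    mu = [1] * (n + 1)
    is_composite = [False] * (n + 1)
    for p in range(2, n + 1):
        if not is_composite[p]:
            for m in range(p, n + 1, p):
                if m > p:
                    is_composite[m] = True
                mu[m] = -mu[m]
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0
    mu[0] = 0
    return mu


def kfree_count_formula(x, k):
    """Sum of mu(n) * floor(x / n^k) over n with n^k <= x."""
    root = 1
    while (root + 1) ** k <= x:
        root += 1
    mu = mobius_upto(root)
    return sum(mu[n] * (x // n ** k) for n in range(1, root + 1))


def kfree_count_direct(x, k):
    """Count n <= x not divisible by d^k for any d >= 2."""
    count = 0
    for n in range(1, x + 1):
        d = 2
        free = True
        while d ** k <= n:
            if n % d ** k == 0:
                free = False
                break
            d += 1
        count += free
    return count


if __name__ == "__main__":
    x = 10000
    print(kfree_count_formula(x, 2), kfree_count_direct(x, 2), 6 * x / math.pi ** 2)
```

The trivial bound for the difference between the count and $x/\zeta(k)$ is of order $x^{1/k}$; the point of the section is to do better than that.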
Thus assuming

$$\sum_{n \le x}\mu(n) = O(x(E_0(x))^{-1}) \quad \text{where } E_0(x) = \exp(\mu(\log x)^{3/5}(\log\log x)^{-1/5})$$

($\mu > 0$ is some absolute constant), we have $E_2 = O\big(x\sum_{U \ge Y} U^{1-k}(E_0(U))^{-1}\big)$ ($U$ runs over numbers $Y2^n$: $n = 0, 1, 2, \ldots$) $\ll (E_0(Y))^{-1}x\sum_{U \ge Y} U^{1-k} \ll (E_0(Y))^{-1}xY^{1-k}$.
Hence

$$E_0 = O(Y + (E_0(Y))^{-1}xY^{1-k}).$$

Choose $Y$ by

$$Y = x^{1/k}\big(\exp((200)^{-1}\mu k^{-1}(\log Y)^{3/5}(\log\log Y)^{-1/5})\big)^{-1}$$

so that

$$(E_0(Y))^{-1}xY^{1-k} = x^{1/k}\big(\exp((200)^{-1}\mu k^{-1}(\log Y)^{3/5}(\log\log Y)^{-1/5})\big)^{k-1}(E_0(Y))^{-1}
\le x^{1/k}\Big(\exp\Big(\frac{\mu}{2}(\log Y)^{3/5}(\log\log Y)^{-1/5}\Big)\Big)^{-1}.$$

Also from the equation defining $Y$ we have (for fixed $k$, as $x \to \infty$)

$$\log Y \sim \frac{1}{k}\log x$$

and so

$$E_0 = O(Y + xY^{1-k}(E_0(Y))^{-1}) = O\big(x^{1/k}(\exp(\mu_1 k^{-8/5}(\log x)^{3/5}(\log\log x)^{-1/5}))^{-1}\big)$$

where $\mu_1 > 0$ is an absolute constant. Thus we have the following theorem.


Theorem 4 (A. WALFISZ). Let $k \ge 2$ be an integer constant and let $\mu_k(n) = 1$ if $n$ is not divisible by the k-th power of any prime and zero otherwise. Then for all $x \ge x_0(k)$, we have

$$\sum_{n \le x}\mu_k(n) = \frac{x}{\zeta(k)} + O\big(x^{1/k}(\exp(\mu_1 k^{-8/5}(\log x)^{3/5}(\log\log x)^{-1/5}))^{-1}\big),$$

where $\mu_1 > 0$ is an absolute constant.


H.L. MONTGOMERY and R.C. VAUGHAN assumed RH and treated the term $E_2$ (more effectively than A. AXER) and arrived at the error estimate $O(x^{\frac{1}{k+1} + \varepsilon})$. (In fact they treated $E_1$ also more effectively and got even better results on RH). Their method (see [H.L.M; R.C.V]) is embodied in the following lemma.

Lemma: (H.L. MONTGOMERY and R.C. VAUGHAN). Let $Y$ be bounded above by a fixed positive power of $x$. Then for every $\varepsilon > 0$ and all $x \ge x_0(\varepsilon, k)$, we have

$$E_2 = \sum_{n > Y}\mu(n)\left(\left[\frac{x}{n^k}\right] - \frac{x}{n^k}\right) = O\big(x^\varepsilon + x^{\frac12 + \varepsilon}Y^{\frac12 - \frac{k}{2}}\big).$$


Proof: We have, by Perron’s formula (for 20 < T < x) 


l+e+iT s 
Dum | :|= si) aoe pie ds 


n>Y It+e—iT n>Y 


Denoting the integral on the right by I and the sum over n by ¢y (ks), we have 


| Leer 


yit2e 
[2—— c(s)ty (ks) ds +) Hn) (= :) ed ( T +) 


l oa 
271 ste-iT* n<Y 


where 7* lies between 4 and 2T (since Cy (ks) = O(1) and ft. 7 \o(s)ldt = O(T logT) 


for o = 5): Thus 


l : 
x x | xtet+iT* x l+2¢€ 
> un) (| ]- =) = =a | c(s)tv (ks) ds +0 (“= +37) | 
ney nh n 20 i 5+e-iT* T 
We now assume RH and show that on (o = 5 + ¢,|t| < T) we have 

ty (ks) a O(Y~3+3+36 73), 


Let V > 10and let U runover the numbers Y2”"(n = 0, 1, 2,...). Then by Perron’s formula 
again, we have (with s = 5 +€+ it) 


| 1+i1V dW 
eaten ke wy (UO ea) 
oni Shay (€(ks + W)) ( ) W 
U'+e i 
De wimym—* +0 ( 7 )+ow- 
U<m<2U 


We use (f(5 +et+ it)! = Oc 5((\t| + 2)°) for all € > 0,5 > O (this ue is due to 
J.E. LITTLEWOOD [J.E.L]) and move the line of integration to Re W = —5 ae 5 — €and 
take rough estimates. Thus we obtain 


Y> (mm = O (versus + ee + u-4) 
U<m<2U “ 
We next choose $V$ to be a large power of $U$ and sum with respect to $U$. Thus we obtain the lemma (since $\varepsilon > 0$ is an arbitrary constant). □

From the lemma it follows that

$$\sum_{n \le x}\mu_k(n) = \frac{x}{\zeta(k)} + O\big(Y + x^{\frac12 + \varepsilon}Y^{\frac12 - \frac{k}{2}}\big);$$

choosing $Y = x^{\frac{1}{k+1}}$ we get the MONTGOMERY-VAUGHAN Theorem, namely

Theorem 5 (H.L. MONTGOMERY and R.C. VAUGHAN [H.L.M; R.C.V]). Let $k \ge 2$ be any integer constant. Assume RH. Let $Q_k(x)$ be the number of k-free numbers $n \le x$. Then

$$Q_k(x) = \frac{x}{\zeta(k)} + O_{k,\varepsilon}\big(x^{\frac{1}{k+1} + \varepsilon}\big)$$

holds for every $\varepsilon > 0$.


From the proof it is plain that we can make the following generalisation. 


Theorem 6 Let G(s) = ae B,n~* (where By, is any sequence of complex numbers) 


converge absolutely ino > 1. Suppose that G(s) is regular ino > except for a finite 
number of poles and that for alle > O and é > 0 we have 


ff | a(5 +e +it) 


For x > O, let P(x) denote the sum of the residues of x'G(s)(t(sk))~!s—!at the poles 
(ino > 5) and that 


dt == O-.3(T°). 


> _ Bn — P(x) = O(x”) 
nsx 
l—-a@ 
for some positive constant a < 1 Further let Bn = Oc(Xx*) where X = x*+1-%e for 


k 
every € > O. Then, we have 


2) HAM) Ba = Y mom? (S z) + O<(Xx*), 


m kn<x 
for every e > 0. 


Remark 1 Instead of ¢(ks) onecan take any finite power product of ¢ (ks) and its derivatives 
(with integer constants as exponents) which is regular at s = t We can do the same thing 


with L(ks, x) provided we assume that L(s, x) 4 Oino > 5). A curious application of 
the MONTGOMERY- VAUGHAN method gives (of course on RH), 


I 
> loge D> ux(n) = (can YEP) x + eet, 


PSX n<xp7h 
where $k \ge 2$ is any integer constant and $\mu_k(n)$ is one or zero according as $n$ is k-th power free or not. In the last sum above $p$ runs over all primes.


Remark 2 We mention that we can impose one ~ Bn = O<¢(x‘). Then the above mean- 
value condition in (T, 27) is automatically satisfied (we leave this as an exercise to the 
reader) and trivially ~@ = e€ for every « > 0. Hence in the last statement of the theorem the 


error term is O(x@+9U+0""'). Tf & = 2 or 3 we can take G(s) = £2(s) since a > 3 If 
k = 2 wecan take G(s) = ¢3(s) sincea > ae However we cannot take G(s) = C4(s) 
and prove the error term to be $O(x^{\frac12-\delta})$ for some $\delta > 0$. It would be interesting to prove
this since

$$\frac{\zeta^4(s)}{\zeta(2s)} = \sum_{n=1}^{\infty} d^2(n)\, n^{-s},$$

an identity due to S. Ramanujan.
Remark 3 In §1 to §5 we have borrowed freely from the papers [R.B; K.R] and [K.R]q 


although we do not refer to them in the text. Theorem 6 and Remarks 1 and 2 above, though
new, are essentially due to H.L. MONTGOMERY and R.C. VAUGHAN.


Remark 4 We hope to publish a paper II with the same title where we study the oscillation 
of the error terms in the prime number theorem in a more general set up. 
Appendix


We begin by recalling Grimm's conjecture (for reference to Grimm's paper see the two important
papers [A-1] and [A-2] below, on Grimm's conjecture by K. RAMACHANDRA,
T.N. SHOREY and R. TIJDEMAN). Of course the conjecture is very much open today.


GRIMM'S CONJECTURE (G.C.): Let $n \ge 1$ and $n+1, n+2, \ldots, n+r$ be $r\ (\ge 1)$
consecutive integers with the property that none of them is prime. Then their product is
divisible by at least $r$ distinct primes.


Remark: Note for example that $(n+1)\cdots(2n)$ can be divisible by at most $O(n(\log n)^{-1})$
primes.
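For small ranges the conjecture as stated above can be tested directly. The sketch below is only an illustration (the function names are hypothetical, and trial division is used purely for simplicity): it runs through every block $n+1, \ldots, n+r$ of consecutive composite integers below a bound and compares $r$ with the number of distinct primes dividing the product.

```python
def distinct_prime_factors(m):
    """Return the set of primes dividing m (trial division)."""
    factors = set()
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    return factors


def is_prime(m):
    return m >= 2 and distinct_prime_factors(m) == {m}


def conjecture_holds_below(limit):
    """Check every run of consecutive composites start, ..., end with end < limit."""
    for start in range(2, limit):
        if is_prime(start):
            continue
        primes_seen = set()
        end = start
        while end < limit and not is_prime(end):
            primes_seen |= distinct_prime_factors(end)
            run_length = end - start + 1
            if len(primes_seen) < run_length:
                return False
            end += 1
    return True


if __name__ == "__main__":
    print(conjecture_holds_below(3000))
```

A finite check of this kind says nothing about the conjecture in general; it only makes the statement concrete.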

Now we come to the point. The object of this appendix is to see that a remark of Professor
PAUL ERDŐS (written in a letter to the author) does not get lost to the mathematical
community. ERDŐS showed in the letter that G.C. implies that for all sufficiently large
positive integers $N$ there is a prime between $N^2$ and $(N+1)^2$. The following theorem is
essentially due to Professor PAUL ERDŐS.


Theorem: Let $r \ge 2$ and $n \ge 2$ be integers and $C$ any positive constant. Put
$R = r - [Cr(\log r)^{-1}]$ and assume that $R \ge 2$. Then the product of $n+1, n+2, \ldots, n+r$ is
divisible by at most $R$ distinct primes, provided $n$ exceeds a large positive constant and $r$
exceeds a certain large positive constant times $n^{\frac12}(\log n)^{-\frac12}$.


Corollary: The conjecture G.C. implies that for all $N \ge N_0$, there is a prime between
$N^2$ and $(N+1)^2$. In fact, if $p_n$ denotes the $n$-th prime, G.C. implies

$$p_{n+1} - p_n = O\left(p_n^{\frac12}(\log n)^{-\frac12}\right).$$


Remark: The deduction of the corollary is easy and hence left to the reader.
Proof: Assume that the product of $n+1, \ldots, n+r$ is divisible by at least $R+1$ distinct
primes. Note that the integer


a AN gs OPE) 
7 r| 


= 
(n+r)’ | Vain ten’ (1 +O (-))| 


which is < (4*£)"C{ where C; > 0 is some constant. 
Denote by pj,..., p; all the primes dividing r!. Then 


n-+r 
Ipipas---. Pi  ( : Jos 


where $C_2 > 0$ is a constant. (Here we have used Chebyshev's result $\sum_{p \le x} \log p = O(x)$.)
Hence (using $1 + x \le e^x$ for $x \ge 0$) we have


I 


does not exceed 


n\r ry Con\" r2 


Here by our assumption (using Chebyshev's result $\pi(x) = O(x/\log x)$, or equivalently the
$n$-th prime exceeds a positive constant times $n \log n$ for $n \ge 2$)


[p\,---, Pj = Pilo---s PR41 = MWa<k<R(C3k log k) 
= C3(R))m2<rer log k, 
where $C_3 > 0$ is a constant. It is not hard to see (since $\log\log k$ for $k \ge 3$ is increasing)
that the last expression is $\ge C_4^r r^r (\log r)^r$, where $C_4 > 0$ is some constant. Thus finally we
end up with the inequality

$$(C_4 r \log r)^r \le \left(C_2\, n\, r^{-1} \exp\frac{r}{n}\right)^r,$$

i.e.

$$r^2 \log r = O\left(n \exp\frac{r}{n}\right).$$

The choice $r = [C_5 n^{\frac12}(\log n)^{-\frac12}]$, where $C_5 > 0$ is a large constant, leads to a contradiction
which proves the theorem. □


Remark: We have also the following stronger conjecture due to Grimm: If $n \ge 1$ and
$r \ge 1$ are integers and $n+1, n+2, \ldots, n+r$ are all composite there exist distinct primes
$q_1, \ldots, q_r$ such that $q_k$ divides $n+k$ for $k = 1, 2, \ldots, r$. But this does not seem to give
more information than $p_{n+1} - p_n = O\left(p_n^{\frac12}(\log n)^{-\frac12}\right)$.
Sums of Squares: An Elementary Method


R.A. Rankin 


1 Introduction 


If $x_1, x_2, \ldots, x_s$ are integers, positive, negative or zero, such that

$$x_1^2 + x_2^2 + \cdots + x_s^2 = n,$$

then $(x_1, x_2, \ldots, x_s)$ is called a representation of $n$ as a sum of $s$ squares, and the total
number of representations is denoted by $R_s(n)$. Two representations $(x_1, x_2, \ldots, x_s)$ and
$(y_1, y_2, \ldots, y_s)$ are considered to be different unless

$$x_1 = y_1,\ x_2 = y_2,\ \ldots,\ x_s = y_s.$$

Further, using a notation introduced by J.W.L. Glaisher, we write $R_{\alpha,\beta}(n)$ for the number
of representations of $n$ as a sum of squares of which $\alpha$ are odd and $\beta$ are even, no restriction
being placed upon the order of the squares. Observe that $R_s(0) = 1$, and that $R_{\alpha,\beta}(0) = 1$
if $\alpha = 0$, but that otherwise $R_{\alpha,\beta}(0) = 0$.
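The two counting functions can be made concrete by brute force. The following is a small sketch under the definitions just given; the names `fixed_order_count`, `R_parity` and `R_total` are illustrative and not from the text. It first counts representations in which the odd squares occupy prescribed positions and then multiplies by the number of ways of choosing those positions.

```python
from math import comb, isqrt


def fixed_order_count(n, odd_slots, even_slots):
    """Representations of n by odd_slots odd squares followed by even_slots even squares."""
    counts = [0] * (n + 1)
    counts[0] = 1
    for parity, slots in ((1, odd_slots), (0, even_slots)):
        for _ in range(slots):
            new_counts = [0] * (n + 1)
            for total, ways in enumerate(counts):
                if ways == 0:
                    continue
                bound = isqrt(n - total)
                for x in range(-bound, bound + 1):
                    if x % 2 == parity:
                        new_counts[total + x * x] += ways
            counts = new_counts
    return counts[n]


def R_parity(alpha, beta, n):
    """R_{alpha,beta}(n): alpha odd squares and beta even squares, in any order."""
    return comb(alpha + beta, alpha) * fixed_order_count(n, alpha, beta)


def R_total(s, n):
    """R_s(n), obtained by summing over the possible numbers of odd squares."""
    return sum(R_parity(alpha, s - alpha, n) for alpha in range(s + 1))


if __name__ == "__main__":
    print(R_total(2, 5), R_total(4, 1), R_parity(4, 0, 4), R_parity(2, 2, 2))
```

For example, $5 = (\pm1)^2 + (\pm2)^2 = (\pm2)^2 + (\pm1)^2$ gives $R_2(5) = 8$, all eight representations having one odd and one even square.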

Mathematicians have been interested in evaluating $R_s(n)$ and $R_{\alpha,\beta}(n)$ since the 17th
century and earlier, as a study of Chapters 6-9 of [3] will confirm, and various methods have
been used. Nowadays the use of modular forms provides the simplest answer. In the case
of $R_s(n)$ the problem boils down to expressing the $s$-th power of the theta function $\vartheta_3$ as an
Eisenstein series plus a cusp form, both being holomorphic modular forms of weight $s/2$.
The problem is simpler when $s$ is even, as we shall generally assume, mainly because no
complicated multiplier systems are then involved. See, for example, §7.4 of [10].

In a series of papers in the old Quarterly Journal (summarised in [4]) Glaisher obtained
formulae for $R_{2s}(n)$ and $R_{\alpha,\beta}(n)$ for $s = 1, 2, \ldots, 9$ and $\alpha + \beta = 2s$ by means of elliptic
function equations. His method, now regarded as old-fashioned, was essentially based on the
consideration of various power series involving $\vartheta_3$ and the Jacobi functions $k$ and $k'$. Among
these there appeared a number of what we now call cusp forms, whose Fourier coefficients
had interesting multiplicative, or partly multiplicative, properties; see [8]. Where these
cusp forms were absent, or had a zero Fourier coefficient, $R_{2s}(n)$ or $R_{\alpha,\beta}(n)$ was expressed
in terms of divisor functions of different types. Among Glaisher's formulae the only
ones which do not involve arithmetical functions other than divisor functions are formulae
(1)-(24) given below. The divisor functions are defined in equations (25) and (26). Then
we have

$$R_2(n) = 4E_0(n), \tag{1}$$
$$R_4(n) = 8\{2 + (-1)^n\}\Delta_1(n), \tag{2}$$
$$R_6(n) = 16E'_2(n) - 4E_2(n), \tag{3}$$
$$R_8(n) = (-1)^n\,\tfrac{16}{7}\{8\Delta'_3(n) - 15\Delta_3(n)\}. \tag{4}$$

If $N = 2^a N_1$, where $a \ge 0$ and $N_1 \equiv 3 \pmod 4$,

$$R_{10}(N) = \tfrac45\{E_4(N) + 16E'_4(N)\}, \tag{5}$$
$$R_{12}(2n) = \tfrac{24}{31}\{21\Delta_5(2n) + 10\Delta'_5(2n)\}, \tag{6}$$
$$R_{2,0}(8n+2) = 4E_0(4n+1), \tag{7}$$
$$R_{1,1}(4n+1) = 4E_0(4n+1), \tag{8}$$
$$R_{4,0}(8n+4) = 16\Delta_1(2n+1), \tag{9}$$
$$R_{3,1}(4n+3) = 8\Delta_1(4n+3), \tag{10}$$
$$R_{2,2}(4n+2) = 24\Delta_1(2n+1), \tag{11}$$
$$R_{1,3}(4n+1) = 8\Delta_1(4n+1), \tag{12}$$
$$R_{6,0}(8n+6) = -8E_2(4n+3), \tag{13}$$
$$R_{4,2}(4n) = 240E'_2(n), \tag{14}$$
$$R_{3,3}(4n+3) = -20E_2(4n+3), \tag{15}$$
$$R_{2,4}(4n+2) = 60E'_2(2n+1), \tag{16}$$
$$R_{8,0}(8n) = 256\Delta'_3(n), \tag{17}$$
$$R_{4,4}(4n) = 1120\Delta'_3(n). \tag{18}$$

If $N$ and $N_1$ are as before equation (5), then

$$R_{8,2}(4N) = 576E'_4(N), \tag{19}$$
$$R_{6,4}(2N_1) = 168E'_4(N_1), \tag{20}$$
$$R_{4,6}(4N) = 2688E'_4(N), \tag{21}$$
$$R_{2,8}(2N_1) = 36E'_4(N_1), \tag{22}$$
$$R_{8,4}(8n) = 3960\Delta'_5(2n), \tag{23}$$
$$R_{4,8}(8n) = 3960\Delta'_5(2n). \tag{24}$$


In formulae (7)-(18), $R_{\alpha,\beta}(m) = 0$ unless $m$ is of the form stated. This is obvious since,
if $x$ is odd, then $x^2 \equiv 1 \pmod 8$, while, otherwise, we have $x^2 \equiv 0 \pmod 4$. The functions
$\Delta_\nu(n)$, $\Delta'_\nu(n)$, $E_\nu(n)$ and $E'_\nu(n)$ are defined as follows. Let $\delta$ be an odd divisor of $n$ and
write $\delta\delta' = n$. Also let $\chi(n)$ be the character defined by

$$\chi(n) = 0 \ (n \text{ even}), \qquad \chi(n) = (-1)^{(n-1)/2} \ (n \text{ odd}).$$

Then

$$\Delta_\nu(n) = \sum_{\delta|n} \delta^\nu, \qquad \Delta'_\nu(n) = \sum_{\delta|n} \delta'^{\,\nu}, \tag{25}$$
$$E_\nu(n) = \sum_{\delta|n} \chi(\delta)\delta^\nu, \qquad E'_\nu(n) = \sum_{\delta|n} \chi(\delta)\delta'^{\,\nu}. \tag{26}$$

Note that formulae for $R_{0,2s}(4n)$ are omitted since $R_{0,2s}(4n) = R_{2s}(n)$.
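The divisor functions of (25) and (26) are easy to compute, and formulae (1), (2) and (3) can be compared with a direct count for small $n$. The sketch below is an illustration only; the Python names are hypothetical, and $R_s(n)$ is obtained by repeated convolution rather than by any method from the text.

```python
from math import isqrt


def chi(n):
    """The character of the text: 0 for even n, (-1)^((n-1)/2) for odd n."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def odd_divisors(n):
    return [d for d in range(1, n + 1, 2) if n % d == 0]


def Delta(nu, n):
    return sum(d ** nu for d in odd_divisors(n))


def Delta_prime(nu, n):
    return sum((n // d) ** nu for d in odd_divisors(n))


def E(nu, n):
    return sum(chi(d) * d ** nu for d in odd_divisors(n))


def E_prime(nu, n):
    return sum(chi(d) * (n // d) ** nu for d in odd_divisors(n))


def representation_counts(s, limit):
    """Return the list [R_s(0), ..., R_s(limit)]."""
    one_square = [0] * (limit + 1)
    for x in range(-isqrt(limit), isqrt(limit) + 1):
        one_square[x * x] += 1
    counts = [1] + [0] * limit
    for _ in range(s):
        counts = [sum(counts[m] * one_square[n - m] for m in range(n + 1))
                  for n in range(limit + 1)]
    return counts


if __name__ == "__main__":
    R6 = representation_counts(6, 10)
    for n in range(1, 11):
        print(n, R6[n], 16 * E_prime(2, n) - 4 * E(2, n))
```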


2 Historical Remarks 


Some of these formulae have, over the years, been proved by elementary methods not 
involving elliptic or modular functions, by mathematicians such as Pepin, Dirichlet and 
Uspensky. The most complete elementary account so far published is Helmut Bessel’s 
doctoral dissertation submitted to the University of Königsberg [2] in 1929. He bases his
work on three formulae of Liouville in articles in the Journal de Mathématiques between
the years 1840 and 1850, with the title 'Sur quelques formules générales qui peuvent être
utiles dans la théorie des nombres'. See Chapter 11 of [3] or pp. 365-463 of [1]. I give the
following as an example: 


I f i 
DUf& +9 - fa-Wh= 5) 8FO — FO), 


b|n 


where $f$ is an odd function and the summation is over all solutions of the equation
$ax + by = 2n$ ($a, b, x, y$ odd).

By means of these formulae Bessel proves the majority of the results stated above,
excluding formulae (19)-(24). He also makes use of certain cusp form coefficients such as
$\sum (x_1^4 - 3x_1^2x_2^2)$, the summation, in this case, being taken over all solutions of the equation
$x_1^2 + x_2^2 = n$. This is a multiple of Glaisher's multiplicative function $\chi_4(n)$, which is the
Fourier coefficient of a cusp form of weight 5, and arises in the study of 10 squares.

I completed the work of the present article in July 1944 but it was not until October 1947 
that I managed to borrow a copy of Bessel’s thesis, through Inter Library Loan. I have now, 
many years later, succeeded in obtaining a photocopy of his thesis through the courtesy of 
the Mathematics Library of the University of Illinois at Urbana. 

In what follows I assume (1) and deduce the remaining formulae from it. A proof of (1) 
can be found in various treatises on number theory, for example in Chapter 2 of [5] or §6.7 
of [7]. An interesting account of the history of the subject will be found in Chapter 9 of [6] 
and the notes to that chapter. Finally, a good account of Liouville’s methods is in Chapter 13 
of [11] and in [1]. 

In my treatment of the subject I do not use any of Liouville’s formulae, but base my 
account on the elementary Theorem 1 below. It may be mentioned that this work has
already been used in my paper [9]. Inevitably there are certain similarities between my 
method and Bessel’s, but the two methods are not identical. 
3 General Results


All letters, with the exception of $\pi$, denote integers; in particular $x, y, a, b, c, \xi, \eta, n, d$ and
$\delta$ denote non-negative integers, and will usually be positive. We write $d$ for any divisor of
$n$ and $\delta$ for an odd divisor of $n$, and put

$$d' = \frac{n}{d}, \qquad \delta' = \frac{n}{\delta}. \tag{27}$$

It follows from (25) and (26) that, if $\alpha \ge 0$,

$$\Delta_\nu(2^\alpha n) = \Delta_\nu(n), \qquad \Delta'_\nu(2^\alpha n) = 2^{\alpha\nu}\Delta'_\nu(n), \tag{28}$$
$$E_\nu(2^\alpha n) = E_\nu(n), \qquad E'_\nu(2^\alpha n) = 2^{\alpha\nu}E'_\nu(n). \tag{29}$$

If $n$ is odd,

$$\Delta'_\nu(n) = \Delta_\nu(n), \qquad E'_\nu(n) = \chi(n)E_\nu(n). \tag{30}$$

In particular, by (26) and (30),

$$E_0(4n+3) = E'_0(4n+3) = 0. \tag{31}$$


The number of representations of a number as a sum of $s$ squares may be expressed in
terms of the representations by fewer than $s$ squares by means of the formulae

$$R_{\alpha+\beta}(n) = \sum_{m=0}^{n} R_\alpha(m)R_\beta(n-m) \tag{32}$$

and

$$R_{a+c,\,b+d}(n) = \frac{(a+b+c+d)!\,a!\,b!\,c!\,d!}{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!} \sum_{m=0}^{n} R_{a,b}(m)R_{c,d}(n-m). \tag{33}$$

The proof of (32) is obvious. If $R^*_{\mu,\nu}(m)$ is the number of representations of $m$ in the
form

$$x_1^2 + x_2^2 + \cdots + x_\mu^2 + y_1^2 + y_2^2 + \cdots + y_\nu^2,$$

where $x_1, x_2, \ldots, x_\mu$ are odd, and $y_1, y_2, \ldots, y_\nu$ are even, then clearly,

$$R_{\mu,\nu}(m) = \frac{(\mu+\nu)!}{\mu!\,\nu!}\,R^*_{\mu,\nu}(m),$$

and (33) follows since

$$R^*_{a+c,\,b+d}(n) = \sum_{m=0}^{n} R^*_{a,b}(m)R^*_{c,d}(n-m).$$
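The factorial factor in (33) is simply the ratio of the binomial coefficients that count the possible positions of the odd squares. As a hedged check, assuming (33) in the form $R_{a+c,b+d}(n)\,(a+b)!\,(c+d)!\,(a+c)!\,(b+d)! = (a+b+c+d)!\,a!\,b!\,c!\,d! \sum_m R_{a,b}(m)R_{c,d}(n-m)$, the following sketch compares both sides for a few small cases; all function names are illustrative.

```python
from math import comb, factorial, isqrt


def fixed_order_count(n, odd_slots, even_slots):
    """R*: odd squares in the first odd_slots places, even squares in the rest."""
    counts = [0] * (n + 1)
    counts[0] = 1
    for parity, slots in ((1, odd_slots), (0, even_slots)):
        for _ in range(slots):
            new_counts = [0] * (n + 1)
            for total, ways in enumerate(counts):
                if ways == 0:
                    continue
                bound = isqrt(n - total)
                for x in range(-bound, bound + 1):
                    if x % 2 == parity:
                        new_counts[total + x * x] += ways
            counts = new_counts
    return counts[n]


def R_parity(alpha, beta, n):
    return comb(alpha + beta, alpha) * fixed_order_count(n, alpha, beta)


def formula_33_holds(a, b, c, d, n):
    """Compare both sides of (33) after clearing the denominator."""
    left = (R_parity(a + c, b + d, n) * factorial(a + b) * factorial(c + d)
            * factorial(a + c) * factorial(b + d))
    convolution = sum(R_parity(a, b, m) * R_parity(c, d, n - m) for m in range(n + 1))
    right = (factorial(a + b + c + d) * factorial(a) * factorial(b)
             * factorial(c) * factorial(d) * convolution)
    return left == right


if __name__ == "__main__":
    print(formula_33_holds(4, 0, 0, 2, 12), formula_33_holds(2, 2, 2, 0, 8))
```

With $(a, b, c, d) = (4, 0, 0, 2)$ the factor is $6!\,4!\,2!/(4!\,2!\,4!\,2!) = 15$, and with $(2, 2, 2, 0)$ it is $5/2$.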


As an illustration of the use of formulae (32) and (33), we have

$$R_{8,0}(8n) = \sum_{m=0}^{8n} R_{4,0}(m)R_{4,0}(8n-m) = \sum_{m=1}^{n} R_{4,0}(8m-4)R_{4,0}(8n-8m+4).$$
If we assume the truth of (9) for the moment, we obtain

$$R_{8,0}(8n) = 256\sum_{m=1}^{n} \Delta_1(2m-1)\Delta_1(2n-2m+1).$$

The right hand side may be written as $256\sum xy$, where the summation is extended over all
solutions $a, b, x, y$ of the equation

$$ax + by = 2n$$

for which $abxy$ is odd. Thus the problem of evaluating $R_{8,0}(8n)$ is reduced to the investigation
of the solutions of this equation. This example is typical of the methods used to
derive (2)-(24). We denote by $T(n)$ the set of all positive $a, b, x, y$ which satisfy


ax + by =n, x, y odd, 


and write $T_1(n)$ for the subset of $T(n)$ for which $ab$ is odd. Clearly $T_1(n)$ is empty unless
$n$ is even.
Further, we denote by $S(n)$ the set of all positive $\xi, \eta, x, y$ that satisfy

$$\xi x + \eta y = n, \quad x, y \text{ odd}, \quad (\xi, \eta) = 1.$$

We write $T(n)$, $T_1(n)$, or $S(n)$ below the summation sign to indicate a sum carried out over
all solutions of $T(n)$, $T_1(n)$, or $S(n)$, respectively. If there are no solutions (e.g. if $n = 1$),
such a sum is empty and has the value zero.

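Let $f(x, y, a, b)$ be any function of $x, y, a, b$. Put $d = (a, b)$, $a = d\xi$, $b = d\eta$. Then
we have

$$\sum_{T(n)} f(x, y, a, b) = \sum_{d|n}\ \sum_{S(d')} f(x, y, d\xi, d\eta). \tag{34}$$

Similarly, if $n$ is even,

$$\sum_{T_1(n)} f(x, y, a, b) = \sum_{\delta|n}\ \sum_{S(\delta')} f(x, y, \delta\xi, \delta\eta). \tag{35}$$

If

$$f(x, y, ca, cb) = c^\nu f(x, y, a, b) \tag{36}$$

for every $c$, (34) and (35) may be written

$$\sum_{T(n)} f(x, y, a, b) = \sum_{d|n} d^\nu \sum_{S(d')} f(x, y, \xi, \eta), \tag{37}$$
$$\sum_{T_1(n)} f(x, y, a, b) = \sum_{\delta|n} \delta^\nu \sum_{S(\delta')} f(x, y, \xi, \eta). \tag{38}$$

Equation (38) holds only if $n$ is even.
The preceding formulae show that we need only consider solutions of $S(n)$.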
Write

$$\delta_n = \tfrac12\{1 - (-1)^n\}.$$

We assume that $n > 1$, as otherwise $S(n)$ is empty.
Let $C(n)$ be the set of solutions of $S(n)$ for which

$$|\xi - \eta| = \delta_n. \tag{39}$$

When $n$ is odd we write $C(n) = C_1(n) + C_2(n)$, where $C_1(n)$ is the set of solutions of $C(n)$
for which $\xi = \eta - 1$, and $C_2(n)$ is the set for which $\xi = \eta + 1$.


The solutions of $C(n)$ are easy to obtain, and are given in the following three Lemmas.


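Lemma 1 If $n$ is even $C(n)$ consists of the $\tfrac12 n$ solutions

$$x = 2u - 1, \quad y = n - 2u + 1, \quad \xi = \eta = 1, \quad 1 \le u \le \tfrac12 n.$$

This follows, since $\xi = \eta$ by (39), and therefore $\xi = \eta = 1$, since $(\xi, \eta) = 1$.


Lemma 2 If $n$ is odd $C_1(n)$ consists of the $\tfrac12(n-1)$ solutions

$$\xi = \left[\frac{n}{2t}\right], \quad \eta = \xi + 1, \quad x = 2t - y, \quad y = n - 2\xi t,$$

where $t$ is any integer satisfying

$$0 < t < \tfrac12 n. \tag{40}$$

Write $2t = x + y$ for any solution of $C_1(n)$. Then clearly (40) holds, and the lemma follows
since

$$2\xi t < 2\xi t + y = n < 2\xi t + 2t.$$

We have, similarly,


Lemma 3 If $n$ is odd $C_2(n)$ consists of the $\tfrac12(n-1)$ solutions

$$\eta = \left[\frac{n}{2t}\right], \quad \xi = \eta + 1, \quad y = 2t - x, \quad x = n - 2\eta t,$$

where $t$ is any integer satisfying (40).


We write $c(n)$ for the number of members of $C(n)$. By Lemmas 1, 2, and 3,

$$c(n) = \begin{cases} \tfrac12 n & (n \text{ even}), \\ n - 1 & (n \text{ odd}). \end{cases} \tag{41}$$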


We now consider the properties of another subclass of solutions of $S(n)$. Let $C'(n)$ be
the set of solutions of $S(n)$ for which $x = y$. Then we have $x(\xi + \eta) = n$. Thus $x$ must
be a divisor of $n$. Let $\delta$ be any odd divisor of $n$. Then, if there is a solution of $S(n)$ with
$x = y = \delta$, we must have, by (27),

$$\xi + \eta = \delta'. \tag{42}$$

It follows that $\delta' > 1$, i.e.

$$\delta < n. \tag{43}$$

If (43) is satisfied, there are exactly $\phi(\delta')$ solutions of (42); here $\phi$ is Euler's function.
For, since $(\xi, \eta) = 1$, the only values which $\xi$ can take are the $\phi(\delta')$ numbers less than $\delta'$
that are prime to $\delta'$.


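Lemma 4 For any positive integer $n$

$$\sum_{\delta|n} \phi(\delta') = \begin{cases} n & (n \text{ odd}), \\ \tfrac12 n & (n \text{ even}). \end{cases}$$

The case when $n$ is odd follows from the well known result, which holds for odd and
even $n$:

$$\sum_{d|n} \phi(d) = n.$$

Suppose that $n$ is even of the form $2^\alpha m$, where $m$ is odd and $\alpha > 0$. Then

$$\sum_{\delta|n} \phi(\delta') = \sum_{\delta|m} \phi(2^\alpha)\phi(m/\delta) = 2^{\alpha-1}\sum_{d|m} \phi(d) = 2^{\alpha-1}m = \tfrac12 n.$$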
Lemma 5 The solutions of $C'(n)$ are given by

$$x = y = \delta, \quad \xi = u, \quad \eta = \delta' - u, \quad 0 < u < \delta',$$

where $\delta$ is any odd divisor of $n$ which is less than $n$, $\delta' = n/\delta$, and $(u, \delta') = 1$. The total
number of solutions is $c(n)$.


The first part of the lemma has already been proved, so that it remains to prove the second
part. The number of solutions of $C'(n)$ is

$$c'(n) = \sum_{\delta|n,\ \delta<n} \phi(\delta').$$

By Lemma 4, if $n$ is odd,

$$c'(n) = \sum_{\delta|n} \phi(\delta') - 1 = n - 1,$$

and, if $n$ is even,

$$c'(n) = \sum_{\delta|n} \phi(\delta') = \tfrac12 n.$$

It follows from (41) that $c'(n) = c(n)$.
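Lemma 4 and the equality $c'(n) = c(n)$ can both be confirmed by enumeration for small $n$. This is a minimal sketch with hypothetical function names; here $\delta_n$ is taken to be $n \bmod 2$, in agreement with Lemmas 1 to 3.

```python
from math import gcd


def phi(m):
    """Euler's function by direct count."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def lemma4_sum(n):
    """Sum of phi(n / delta) over the odd divisors delta of n."""
    return sum(phi(n // delta) for delta in range(1, n + 1, 2) if n % delta == 0)


def S(n):
    """All positive (x, y, xi, eta) with xi*x + eta*y = n, x and y odd, gcd(xi, eta) = 1."""
    solutions = []
    for x in range(1, n, 2):
        for xi in range(1, (n - 1) // x + 1):
            rest = n - xi * x
            for y in range(1, rest + 1, 2):
                if rest % y == 0 and gcd(xi, rest // y) == 1:
                    solutions.append((x, y, xi, rest // y))
    return solutions


def c(n):
    """The value given by (41)."""
    return n // 2 if n % 2 == 0 else n - 1


def count_C(n):
    return sum(1 for (x, y, xi, eta) in S(n) if abs(xi - eta) == n % 2)


def count_C_prime(n):
    return sum(1 for (x, y, xi, eta) in S(n) if x == y)


if __name__ == "__main__":
    for n in range(2, 13):
        print(n, len(S(n)), count_C(n), count_C_prime(n), c(n))
```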


We now show that the solutions of $S(n)$ may be divided into sets, or 'chains', each
associated with a unique solution $(x, y, \xi, \eta)$ of $S(n)$.
Theorem 1 The solutions of $S(n)$ may be divided into $c(n)$ chains of solutions such that
each solution belongs to one and only one chain. Each chain consists of a sequence of
solutions $(x_m, y_m, \xi_m, \eta_m)$ $(m = 1, 2, \ldots, q)$, with the following properties:

$$|\xi_1 - \eta_1| = \delta_n, \quad x_q = y_q. \tag{44}$$

Also, if $q > 1$, $1 \le m < q$,

$$0 \le |x_{m+1} - y_{m+1}| < |x_m - y_m|, \tag{45}$$
$$\delta_n \le |\xi_m - \eta_m| < |\xi_{m+1} - \eta_{m+1}|, \tag{46}$$
$$x_{m+1} + y_{m+1} = |x_m - y_m|, \tag{47}$$
$$\xi_{m+1} - \eta_{m+1} = -(\xi_m + \eta_m)\,\mathrm{sgn}(x_m - y_m), \tag{48}$$
$$\chi(x_{m+1}) - \chi(y_{m+1}) = -\{\chi(x_m) + \chi(y_m)\}\,\mathrm{sgn}(x_m - y_m). \tag{49}$$


Proof: Since $n > 1$ there exists at least one solution for which

$$|\xi - \eta| = \delta_n, \quad x = y. \tag{50}$$

Any such solution we set in a chain by itself. Clearly (44) is satisfied. If $n = 2$ or 3, there
are no solutions other than those given by (50), so that the theorem holds.


We suppose, therefore, that $n > 3$. Then there exist solutions for which $|\xi - \eta| > 1$, and
also solutions for which $x \ne y$. We consider those two cases separately in what follows.
Suppose that we have a solution with $x \ne y$. Write

$$\epsilon = \mathrm{sgn}(x - y).$$

We can generate a new solution by means of the following transformation,

$$\begin{aligned} x' &= \lambda x - (\lambda+\epsilon)y, & \xi' &= \lambda\xi + (\lambda-\epsilon)\eta, \\ y' &= -(\lambda-\epsilon)x + \lambda y, & \eta' &= (\lambda+\epsilon)\xi + \lambda\eta, \end{aligned} \tag{51}$$

where $\lambda$ is an integer to be chosen. From this we obtain the inverse transformation

$$\begin{aligned} x &= \lambda x' + (\lambda+\epsilon)y', & \xi &= \lambda\xi' - (\lambda-\epsilon)\eta', \\ y &= (\lambda-\epsilon)x' + \lambda y', & \eta &= -(\lambda+\epsilon)\xi' + \lambda\eta'. \end{aligned} \tag{52}$$

It is evident from (51) and (52) that $(\xi', \eta') = 1$, and that both $x'$ and $y'$ are odd for
all choices of $\lambda$. Also $\xi'x' + \eta'y' = n$. It remains to show that we can choose $\lambda$ so that
$\xi', \eta', x'$ and $y'$ are all positive. For $x'$ and $y'$ to be positive we must have

$$\epsilon y < \lambda(x - y) < \epsilon x,$$

i.e.

$$0 < \frac{\min(x, y)}{|x - y|} < \lambda < \frac{\max(x, y)}{|x - y|}. \tag{53}$$

The two numbers on the left and right of $\lambda$ in this inequality differ by unity, and neither
is an integer, since $|x - y|$ is even and both $x$ and $y$ are odd. Hence there exists one integral
value of $\lambda$ and one only for which $x'$ and $y'$ are positive. By (53), $\lambda \ge 1$, and hence $\xi'$ and
$\eta'$ are positive. It follows that the new solution belongs to $S(n)$. By (51)

$$x' + y' = \epsilon(x - y) = |x - y|, \tag{54}$$

and

$$\xi' - \eta' = -\epsilon(\xi + \eta). \tag{55}$$

Also

$$|x - y| = x' + y' > |x' - y'|, \tag{56}$$

and

$$|\xi' - \eta'| = \xi + \eta > |\xi - \eta|. \tag{57}$$

It follows from (55) that

$$\epsilon = -\mathrm{sgn}(\xi' - \eta').$$

If $x \equiv y \pmod 4$,

$$x' \equiv -\epsilon y \pmod 4, \quad y' \equiv \epsilon x \pmod 4,$$

so that

$$\chi(x') - \chi(y') = -\epsilon\{\chi(x) + \chi(y)\}. \tag{58}$$

If $x \equiv -y \pmod 4$, $\chi(x) + \chi(y) = 0$, and

$$x' \equiv (2\lambda+\epsilon)x \pmod 4, \quad y' \equiv -(2\lambda-\epsilon)x \pmod 4,$$

so that

$$\chi(x') = (-1)^\lambda\epsilon\chi(x), \quad \chi(y') = (-1)^\lambda\epsilon\chi(x),$$

and therefore (58) holds in this case too.

If $x' \ne y'$, we can continue this process and arrive at a new solution, and so on. By (56),
we shall eventually obtain a solution for which $x'' = y''$.

We can also generate new solutions by proceeding in the opposite direction. Suppose
that $(x', y', \xi', \eta')$ is a solution for which $|\xi' - \eta'| > \delta_n$. Then we obtain a new solution
$(x, y, \xi, \eta)$ by the transformation

$$\begin{aligned} x &= \mu x' + (\mu+\epsilon')y', & \xi &= \mu\xi' - (\mu-\epsilon')\eta', \\ y &= (\mu-\epsilon')x' + \mu y', & \eta &= -(\mu+\epsilon')\xi' + \mu\eta', \end{aligned} \tag{59}$$

where

$$\epsilon' = -\mathrm{sgn}(\xi' - \eta').$$

Clearly $x$ and $y$ are both odd, and if we solve for $\xi'$ and $\eta'$ in terms of $\xi$ and $\eta$ we can
show that $(\xi, \eta) = 1$. Also $\xi x + \eta y = n$. It remains to show that $\mu$ can be chosen so that
$\xi, \eta, x$ and $y$ are all positive. For $\xi$ and $\eta$ to be positive we must have

$$\epsilon'\xi' < \mu(\eta' - \xi') < \epsilon'\eta',$$

i.e.

$$0 < \frac{\min(\xi', \eta')}{|\xi' - \eta'|} < \mu < \frac{\max(\xi', \eta')}{|\xi' - \eta'|}. \tag{60}$$

The two numbers on either side of $\mu$ in (60) differ by unity, and neither is an integer since
$|\xi' - \eta'| > 1$. This is obvious if $n$ is odd; if $n$ is even $\xi'$ and $\eta'$ are both odd, so that $\xi' - \eta'$
is even, and hence $|\xi' - \eta'| \ge 2$. Hence there exists one integral value of $\mu$ and only one for
which $\xi$ and $\eta$ are both positive. By (60), $\mu$ is positive, and hence $x$ and $y$ are also positive,
and it follows that the solution $(x, y, \xi, \eta)$ belongs to $S(n)$. By (59),

$$x - y = \epsilon'(x' + y'), \tag{61}$$

and

$$\xi + \eta = -\epsilon'(\xi' - \eta') = |\xi' - \eta'|. \tag{62}$$

Hence

$$\epsilon' = \mathrm{sgn}(x - y).$$

Thus (61) and (62) are identical with (54) and (55) with $\epsilon'$ in place of $\epsilon$. Formulae (56),
(57), and (58) may be deduced in the same manner.

If $|\xi - \eta| \ne \delta_n$ we can continue the process and derive a new solution, and so on. By
(57), we shall eventually obtain a solution $(x^*, y^*, \xi^*, \eta^*)$ for which $|\xi^* - \eta^*| = \delta_n$.

Now $\epsilon' = \epsilon$ since each is equal to $-\mathrm{sgn}(\xi' - \eta')$, and on comparing equations (52)
and (59) we see that we must have $\lambda = \mu$, since both are uniquely determined. Thus to
each solution $(x, y, \xi, \eta)$ of $S(n)$ we can assign a unique successor, if $x \ne y$, and a unique
predecessor if $|\xi - \eta| \ne \delta_n$. Thus we have shown that every solution of $S(n)$ is a member
of a unique sequence or chain of solutions, and that the members of any such chain have
the properties (44)-(49). Since $|\xi_1 - \eta_1| = \delta_n$ for the first member, and $x_q = y_q$ for the
last member of each chain, it follows that each chain corresponds to a unique solution of
$C(n)$ and to a unique solution of $C'(n)$. There are therefore $c(n)$ chains. This completes
the proof of Theorem 1. It may be remarked that the number $q$ is not necessarily the same
for different chains.
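The chain construction is effectively an algorithm, and it can be run on small $n$. The sketch below assumes the successor map in the form $x' = \lambda x - (\lambda+\epsilon)y$, $y' = -(\lambda-\epsilon)x + \lambda y$, $\xi' = \lambda\xi + (\lambda-\epsilon)\eta$, $\eta' = (\lambda+\epsilon)\xi + \lambda\eta$, with $\epsilon = \mathrm{sgn}(x - y)$ and $\lambda$ the unique integer strictly between $\min(x,y)/|x-y|$ and $\max(x,y)/|x-y|$. The function names, and the choice $\delta_n = n \bmod 2$, are assumptions made for the illustration.

```python
from math import gcd


def S(n):
    """All positive (x, y, xi, eta) with xi*x + eta*y = n, x and y odd, gcd(xi, eta) = 1."""
    solutions = set()
    for x in range(1, n, 2):
        for xi in range(1, (n - 1) // x + 1):
            rest = n - xi * x
            for y in range(1, rest + 1, 2):
                if rest % y == 0 and gcd(xi, rest // y) == 1:
                    solutions.add((x, y, xi, rest // y))
    return solutions


def successor(solution):
    """Apply the transformation to a solution with x != y."""
    x, y, xi, eta = solution
    eps = 1 if x > y else -1
    # min(x, y) / |x - y| is never an integer, so floor + 1 is the required lambda.
    lam = min(x, y) // abs(x - y) + 1
    return (lam * x - (lam + eps) * y,
            -(lam - eps) * x + lam * y,
            lam * xi + (lam - eps) * eta,
            (lam + eps) * xi + lam * eta)


def chains(n):
    """Build one chain from every solution with |xi - eta| = delta_n."""
    delta_n = n % 2
    starts = sorted(s for s in S(n) if abs(s[2] - s[3]) == delta_n)
    result = []
    for start in starts:
        chain = [start]
        while chain[-1][0] != chain[-1][1]:
            chain.append(successor(chain[-1]))
        result.append(chain)
    return result


if __name__ == "__main__":
    for chain in chains(9):
        print(chain)
```

For $n = 5$ this produces four chains covering the six solutions of $S(5)$, two of them of length two, for instance $(1, 3, 2, 1) \to (1, 1, 4, 1)$.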

Let $\psi_0(u, v)$ and $\psi_1(u, v)$ be any two functions of the integers $u$ and $v$, such that $\psi_0$ is
an even function of both $u$ and $v$, and $\psi_1$ is an even function of $u$ and an odd function of $v$.
Then, by Theorem 1, for two successive members of a chain, indexed by $m$ and $m + 1$,

$$\psi_0(x_m - y_m, \xi_m + \eta_m) = \psi_0(x_{m+1} + y_{m+1}, \xi_{m+1} - \eta_{m+1}) \tag{63}$$

and

$$\psi_1(x_m - y_m, \xi_m + \eta_m)\{\chi(x_m) + \chi(y_m)\} = \psi_1(x_{m+1} + y_{m+1}, \xi_{m+1} - \eta_{m+1})\{\chi(x_{m+1}) - \chi(y_{m+1})\}. \tag{64}$$

Put

$$f_0(x, y, \xi, \eta) = \psi_0(x + y, \xi - \eta) - \psi_0(x - y, \xi + \eta), \tag{65}$$

and

$$f_1(x, y, \xi, \eta) = \psi_1(x + y, \xi - \eta)\{\chi(x) - \chi(y)\} - \psi_1(x - y, \xi + \eta)\{\chi(x) + \chi(y)\}. \tag{66}$$

Then we have, by (63) and (64), for any chain of more than one member,

$$\sum_{m=1}^{q} f_0(x_m, y_m, \xi_m, \eta_m) = \psi_0(x_1 + y_1, \xi_1 - \eta_1) - \psi_0(0, \xi_q + \eta_q), \tag{67}$$

and

$$\sum_{m=1}^{q} f_1(x_m, y_m, \xi_m, \eta_m) = \psi_1(x_1 + y_1, \xi_1 - \eta_1)\{\chi(x_1) - \chi(y_1)\} - 2\psi_1(0, \xi_q + \eta_q)\chi(x_q). \tag{68}$$

Formulae (67) and (68) also hold for a chain containing only one solution. It may be
noted that it is possible to prove results of the same type for functions $\psi(u, v)$ that are
odd functions of $u$, but such results have no application since the sums over $S(n)$ of the
corresponding functions $f$ vanish.


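Theorem 2 If $f_0(x, y, \xi, \eta)$ and $f_1(x, y, \xi, \eta)$ are defined as in (65) and (66), we have (i)
if $n$ is odd,

$$\sum_{S(n)} f_0(x, y, \xi, \eta) = 2\sum_{t=1}^{(n-1)/2} \psi_0(2t, 1) - \sum_{\delta|n} \phi(\delta')\psi_0(0, \delta') + \psi_0(0, 1), \tag{69}$$

$$\sum_{S(n)} f_1(x, y, \xi, \eta) = 4\chi(n)\sum_{u=1}^{\frac14\{n-2+\chi(n)\}} \psi_1(4u, 1) - 2\sum_{\delta|n} \chi(\delta)\phi(\delta')\psi_1(0, \delta') + 2\chi(n)\psi_1(0, 1), \tag{70}$$

and (ii) if $n$ is even

$$\sum_{S(n)} f_0(x, y, \xi, \eta) = \tfrac12 n\,\psi_0(n, 0) - \sum_{\delta|n} \phi(\delta')\psi_0(0, \delta'), \tag{71}$$

$$\sum_{S(n)} f_1(x, y, \xi, \eta) = -2\sum_{\delta|n} \chi(\delta)\phi(\delta')\psi_1(0, \delta'). \tag{72}$$

By (67)

$$\sum_{S(n)} f_0(x, y, \xi, \eta) = \sum_{C(n)} \psi_0(x + y, \xi - \eta) - \sum_{C'(n)} \psi_0(0, \xi + \eta),$$

and (69) and (71) follow from Lemmas 1, 2, 3 and 5. Also, by (68),

$$\sum_{S(n)} f_1(x, y, \xi, \eta) = \sum_{C(n)} \psi_1(x + y, \xi - \eta)\{\chi(x) - \chi(y)\} - 2\sum_{C'(n)} \psi_1(0, \xi + \eta)\chi(x).$$

By Lemma 5,

$$2\sum_{C'(n)} \psi_1(0, \xi + \eta)\chi(x) = 2\sum_{\delta|n,\ \delta<n} \chi(\delta)\phi(\delta')\psi_1(0, \delta'). \tag{73}$$

To evaluate

$$\sum_{C(n)} \psi_1(x + y, \xi - \eta)\{\chi(x) - \chi(y)\}$$

we observe that $\chi(x) - \chi(y) = 0$ unless $x + y \equiv 0 \pmod 4$, so that it is only necessary
to consider values of $x$ and $y$ which satisfy this congruence.
Suppose that $n$ is odd. Then

$$n = \xi(x + y) + (\eta - \xi)y \equiv (\eta - \xi)y \pmod 4,$$

and therefore

$$\chi(x) - \chi(y) = -2\chi(y) = 2\chi(n)\chi(\xi - \eta).$$

Thus

$$\sum_{C(n)} \psi_1(x + y, \xi - \eta)\{\chi(x) - \chi(y)\} = 2\chi(n)\sum_{1 \le u < n/4} \{\psi_1(4u, 1) - \psi_1(4u, -1)\} = 4\chi(n)\sum_{u=1}^{\frac14\{n-2+\chi(n)\}} \psi_1(4u, 1). \tag{74}$$

Finally, if $n$ is even,

$$\sum_{C(n)} \psi_1(x + y, \xi - \eta)\{\chi(x) - \chi(y)\} = \psi_1(n, 0)\sum_{x+y=n} \{\chi(x) - \chi(y)\} = 0, \tag{75}$$

by Lemma 1.
Equations (70) and (72) now follow from (73), (74) and (75).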
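Theorem 2 can be tested numerically by summing $f_0$ and $f_1$ over an explicit enumeration of $S(n)$. The sketch below is illustrative only: the particular test functions `psi0` (even in both arguments) and `psi1` (even in $u$, odd in $v$) are arbitrary choices, and the right-hand sides are coded in the form (69)-(72) with $\delta' = n/\delta$.

```python
from math import gcd


def chi(n):
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def phi(m):
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def odd_divisors(n):
    return [d for d in range(1, n + 1, 2) if n % d == 0]


def S(n):
    return [(x, y, xi, (n - xi * x) // y)
            for x in range(1, n, 2)
            for xi in range(1, (n - 1) // x + 1)
            for y in range(1, n - xi * x + 1, 2)
            if (n - xi * x) % y == 0 and gcd(xi, (n - xi * x) // y) == 1]


def psi0(u, v):
    return u * u + 3 * v * v + u * u * v * v


def psi1(u, v):
    return v * (1 + u * u) + v ** 3


def f0(x, y, xi, eta):
    return psi0(x + y, xi - eta) - psi0(x - y, xi + eta)


def f1(x, y, xi, eta):
    return (psi1(x + y, xi - eta) * (chi(x) - chi(y))
            - psi1(x - y, xi + eta) * (chi(x) + chi(y)))


def rhs_f0(n):
    tail = sum(phi(n // d) * psi0(0, n // d) for d in odd_divisors(n))
    if n % 2 == 1:
        head = 2 * sum(psi0(2 * t, 1) for t in range(1, (n - 1) // 2 + 1))
        return head - tail + psi0(0, 1)
    return (n // 2) * psi0(n, 0) - tail


def rhs_f1(n):
    tail = 2 * sum(chi(d) * phi(n // d) * psi1(0, n // d) for d in odd_divisors(n))
    if n % 2 == 1:
        top = (n - 2 + chi(n)) // 4
        head = 4 * chi(n) * sum(psi1(4 * u, 1) for u in range(1, top + 1))
        return head - tail + 2 * chi(n) * psi1(0, 1)
    return -tail


if __name__ == "__main__":
    for n in range(2, 13):
        print(n, sum(f0(*s) for s in S(n)), rhs_f0(n), sum(f1(*s) for s in S(n)), rhs_f1(n))
```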


4 Two Squares 


We assume (1). As stated in §2, it may be proved by elementary methods. Equations (7) 
and (8) are particular cases of (1). 
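Take

$$\psi_0(u, v) = \tfrac12\left(1 - \cos\tfrac{\pi u}{2}\right),$$

so that, by (65),

$$f(x, y) = f_0(x, y, \xi, \eta) = -\tfrac12\left\{\cos\tfrac{\pi}{2}(x + y) - \cos\tfrac{\pi}{2}(x - y)\right\} = \sin\tfrac{\pi x}{2}\sin\tfrac{\pi y}{2} = \chi(x)\chi(y).$$

By Theorem 2, if $n$ is odd,

$$\sum_{S(n)} f(x, y) = \sum_{t=1}^{(n-1)/2} \{1 - (-1)^t\} = \tfrac12\{n - \chi(n)\}.$$

If $n$ is even,

$$\sum_{S(n)} f(x, y) = \tfrac14 n\left(1 - \cos\tfrac12\pi n\right) = \tfrac14 n\{1 - (-1)^{n/2}\}.$$

Hence, by (34),

$$\sum_{T(2n+1)} f(x, y) = \tfrac12\sum_{\delta|2n+1} \{\delta - \chi(\delta)\} = \tfrac12\{\Delta_1(2n+1) - E_0(2n+1)\}, \tag{76}$$

and

$$\sum_{T(2n)} f(x, y) = \sum_{d|2n}\ \sum_{S(d')} f(x, y) = \tfrac12\sum_{\delta|n} \{\delta - \chi(\delta)\} + \sum_{\delta|n} \delta = \tfrac12\{3\Delta_1(n) - E_0(n)\}. \tag{77}$$

It follows from (76) and (77) that

$$\sum_{T(n)} f(x, y) = \tfrac12\left\{[2 + (-1)^n]\Delta_1(n) - E_0(n)\right\}. \tag{78}$$

Also, by (35),

$$\sum_{T_1(2n)} f(x, y) = \tfrac12\sum_{\delta|n} \frac{n}{\delta}\left\{1 - (-1)^{n/\delta}\right\} = \tfrac12\{1 - (-1)^n\}\Delta_1(n). \tag{79}$$

Now, by (32) and (78),

$$\begin{aligned} R_4(n) &= \sum_{m=1}^{n-1} R_2(m)R_2(n - m) + 2R_2(n) = 16\sum_{m=1}^{n-1} E_0(m)E_0(n - m) + 8E_0(n) \\ &= 16\sum_{T(n)} f(x, y) + 8E_0(n) = 8\{2 + (-1)^n\}\Delta_1(n). \end{aligned}$$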
T(n) 
Also, by (7), (31), (33) and (79),

$$\begin{aligned} R_{4,0}(8n+4) &= \sum_{m=0}^{2n} R_{2,0}(4m+2)R_{2,0}(8n-4m+2) \\ &= 16\sum_{m=0}^{2n} E_0(2m+1)E_0(4n-2m+1) \\ &= 16\sum_{T_1(4n+2)} f(x, y) = 16\Delta_1(2n+1), \end{aligned}$$

and, by (7), (8), (29), (31), (32) and (76),

$$\begin{aligned} R_{3,1}(4n+3) &= 2\sum_{m=0}^{2n} R_{2,0}(2m+2)R_{1,1}(4n-2m+1) \\ &= 32\sum_{m=0}^{2n} E_0(2m+2)E_0(4n-2m+1) \\ &= 16\sum_{\mu=1}^{4n+2} E_0(\mu)E_0(4n+3-\mu) \\ &= 16\sum_{T(4n+3)} f(x, y) = 8\Delta_1(4n+3). \end{aligned}$$

In a similar manner it can be shown that

$$R_{2,2}(4n+2) = 24\sum_{T_1(4n+2)} f(x, y) = 24\Delta_1(2n+1),$$

and (by using the fact that $R_{0,2}(4m) = R_2(m)$)

$$R_{1,3}(4n+1) = 16\sum_{T(4n+1)} f(x, y) + 2R_{1,1}(4n+1) = 8\Delta_1(4n+1).$$


6 Six Squares 


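Take $\psi_1(u, v) = -\tfrac12 v$. Then, by (66),

$$f_1(x, y, \xi, \eta) = -\tfrac12(\xi - \eta)\{\chi(x) - \chi(y)\} + \tfrac12(\xi + \eta)\{\chi(x) + \chi(y)\} = \xi\chi(y) + \eta\chi(x),$$

and, by Theorem 2,

$$\sum_{S(n)} f_1(x, y, \xi, \eta) = \sum_{\delta|n} \delta'\phi(\delta')\chi(\delta), \tag{80}$$

when $n$ is even; and therefore, since (36) is satisfied with $\nu = 1$, we have, by (38) and
Lemma 4,

$$\begin{aligned} \sum_{T_1(2n)} f_1(x, y, a, b) &= \sum_{\delta|2n} \delta\sum_{S(2n/\delta)} f_1(x, y, \xi, \eta) \\ &= \sum_{\delta|n} \delta\sum_{\delta_1|n/\delta} \frac{2n}{\delta\delta_1}\,\phi\left(\frac{2n}{\delta\delta_1}\right)\chi(\delta_1) \\ &= 2n\sum_{\delta_1|n} \frac{\chi(\delta_1)}{\delta_1}\sum_{\delta|n/\delta_1} \phi\left(\frac{2n}{\delta\delta_1}\right) \\ &= 2\sum_{\delta_1|n} \chi(\delta_1)\left(\frac{n}{\delta_1}\right)^2 = 2E'_2(n). \end{aligned} \tag{81}$$

Write

$$U_0(n) = \sum_{ax+by=n,\ 2|ax} x\chi(y), \tag{82}$$
$$U_1(n) = \sum_{ax+by=n,\ 2\nmid ax} x\chi(y), \tag{83}$$

and

$$U(n) = U_0(n) + U_1(n). \tag{84}$$

By (80), (81) and (82),

$$U_1(2n) = \sum_{T_1(2n)} x\chi(y) = \sum_{T_1(2n)} a\chi(y) = \tfrac12\sum_{T_1(2n)} \{a\chi(y) + b\chi(x)\} = \tfrac12\sum_{T_1(2n)} f_1(x, y, a, b) = E'_2(n), \tag{85}$$

since $a$ and $b$ are odd as well as $x$ and $y$, for members of $T_1(2n)$.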
With the help of (85) we can evaluate $U_0(n)$ and $U_1(n)$ in the general case. For, by (1),
(2), (7)-(12) and (33),

$$\begin{aligned} R_{4,2}(4n) &= \tfrac52\sum_{m=0}^{n-1} R_{2,2}(4m+2)R_{2,0}(4n-4m-2) \\ &= 240\sum_{m=0}^{n-1} \Delta_1(2m+1)E_0(2n-2m-1) = 240U_1(2n), \end{aligned} \tag{86}$$

$$R_{4,2}(8n+4) = 15\sum_{m=0}^{n} R_{4,0}(8m+4)R_{0,2}(8n-8m) = 960U_1(2n+1) + 240\Delta_1(2n+1), \tag{87}$$

$$R_{2,4}(4n+2) = \tfrac52\sum_{m=0}^{n} R_{2,2}(4m+2)R_{0,2}(4n-4m) = 240U_1(2n+1) + 60\Delta_1(2n+1), \tag{88}$$

$$\begin{aligned} R_{2,4}(8n+2) &= 15\sum_{m=0}^{n} R_{2,0}(8m+2)R_{0,4}(8n-8m) \\ &= 1440\sum_{m=0}^{n-1} E_0(4m+1)\Delta_1(4n-4m) + 60E_0(4n+1) \\ &= 1440\sum_{m=0}^{2n-1} E_0(2m+1)\Delta_1(4n-2m) + 60E_0(4n+1) \\ &= 1440U_0(4n+1) + 60E_0(4n+1), \end{aligned} \tag{89}$$

$$\begin{aligned} R_{2,4}(8n+6) &= 15\sum_{m=0}^{n} R_{2,0}(8m+2)R_{0,4}(8n-8m+4) \\ &= 480\sum_{m=0}^{n} E_0(4m+1)\Delta_1(4n-4m+2) \\ &= 480\sum_{m=0}^{2n} E_0(2m+1)\Delta_1(4n-2m+2) = 480U_0(4n+3), \end{aligned} \tag{90}$$

$$\begin{aligned} R_{6,0}(8n+6) &= \sum_{m=0}^{n} R_{4,0}(8m+4)R_{2,0}(8n-8m+2) \\ &= 64\sum_{m=0}^{n} E_0(4n-4m+1)\Delta_1(4m+2) \\ &= 64\sum_{m=0}^{2n} E_0(4n-2m+1)\Delta_1(2m+2) = 64U_0(4n+3), \end{aligned} \tag{91}$$

$$\begin{aligned} R_{3,3}(4n+3) &= \tfrac52\sum_{m=0}^{n} \{R_{3,1}(4m+3)R_{0,2}(4n-4m) + R_{1,3}(4m+1)R_{2,0}(4n-4m+2)\} \\ &= 80\sum_{m=0}^{2n} \Delta_1(2m+1)E_0(4n-2m+2) + 20\Delta_1(4n+3) \\ &= 80U_1(4n+3) + 20\Delta_1(4n+3), \end{aligned} \tag{92}$$
and

$$R_6(n) = \sum_{m=0}^{n} R_4(m)R_2(n - m) = 96U(n) - 64U_1(n) + 4E_0(n) + 8\{2 + (-1)^n\}\Delta_1(n). \tag{93}$$

It follows from (85), (86) and (87) that

$$U_1(2n+1) = \tfrac14\{U_1(4n+2) - \Delta_1(2n+1)\} = \tfrac14\{E'_2(2n+1) - \Delta_1(2n+1)\}. \tag{94}$$

This formula may be combined with (85) to give

$$U_1(n) = \tfrac14\left\{E'_2(n) - \tfrac12\left(1 - (-1)^n\right)\Delta_1(n)\right\}. \tag{95}$$

By (88), (89) and (90),

$$24U_0(4n+1) = 4U_1(4n+1) - E_0(4n+1) + \Delta_1(4n+1) = E'_2(4n+1) - E_0(4n+1),$$

and

$$24U_0(4n+3) = 12U_1(4n+3) + 3\Delta_1(4n+3) = 3E'_2(4n+3),$$

which combine to give, by (30) and (31),

$$U_0(2n+1) = \tfrac{1}{24}\{2E'_2(2n+1) - E_2(2n+1) - E_0(2n+1)\}. \tag{96}$$

From (94) and (96) it follows that

$$U(n) = \tfrac{1}{24}\{8E'_2(n) - E_2(n) - E_0(n) - 6\Delta_1(n)\}, \tag{97}$$

if $n$ is odd, and we shall show that this holds also when $n$ is even. For we have, if $n = 2^\alpha m$
where $m$ is odd and $\alpha$ is positive,

$$\begin{aligned} U(n) &= \sum_{p=1}^{\alpha} U_1(2^p m) + U(m) = \tfrac14\sum_{p=1}^{\alpha} E'_2(2^p m) + U(m) \\ &= \tfrac13(2^{2\alpha} - 1)E'_2(m) + \tfrac{1}{24}\{8E'_2(m) - E_2(m) - E_0(m) - 6\Delta_1(m)\} \\ &= \tfrac13 E'_2(n) - \tfrac{1}{24}\{E_2(n) + E_0(n) + 6\Delta_1(n)\}, \end{aligned}$$

by (28) and (29), and this is the same as (97). It follows from (83) and (95) that

$$U_0(2n) = U(2n) - U_1(2n) = \tfrac{1}{24}\{8E'_2(n) - E_2(n) - E_0(n) - 6\Delta_1(n)\} = U(n).$$

This may also be deduced immediately from (82), (83) and (84).
Formulae (3), (13), (14), (15) and (16) now follow from (86), (88), (91), (92) and (93),
with the help of (85), (94), (95), (96) and (97).
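The sums $U_0(n)$ and $U_1(n)$ are finite and can be evaluated directly, which gives a way of checking the closed forms. The following is a sketch under the assumption that the summand in (82) and (83) is $x\chi(y)$ over the solutions of $T(n)$; the function names are illustrative.

```python
def chi(n):
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def odd_divisors(n):
    return [d for d in range(1, n + 1, 2) if n % d == 0]


def Delta1(n):
    return sum(odd_divisors(n))


def E(nu, n):
    return sum(chi(d) * d ** nu for d in odd_divisors(n))


def E_prime(nu, n):
    return sum(chi(d) * (n // d) ** nu for d in odd_divisors(n))


def T(n):
    """All positive (x, y, a, b) with a*x + b*y = n and x, y odd."""
    return [(x, y, a, (n - a * x) // y)
            for x in range(1, n, 2)
            for a in range(1, (n - 1) // x + 1)
            for y in range(1, n - a * x + 1, 2)
            if (n - a * x) % y == 0]


def U0(n):
    """Sum of x*chi(y) over T(n) with a*x even."""
    return sum(x * chi(y) for (x, y, a, b) in T(n) if (a * x) % 2 == 0)


def U1(n):
    """Sum of x*chi(y) over T(n) with a*x odd."""
    return sum(x * chi(y) for (x, y, a, b) in T(n) if (a * x) % 2 == 1)


if __name__ == "__main__":
    for n in range(1, 13):
        closed_form = 8 * E_prime(2, n) - E(2, n) - E(0, n) - 6 * Delta1(n)
        print(n, U0(n), U1(n), 24 * (U0(n) + U1(n)), closed_form)
```

The last two columns agree, which is (97) multiplied by 24.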


7 Eight Squares 
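Take $\psi_0(u, v) = \tfrac14 u^2$. Then, by (65),

$$f_0(x, y, \xi, \eta) = f(x, y) = xy.$$

It follows from Theorem 2 that

$$\sum_{S(n)} f(x, y) = 2\sum_{t=1}^{(n-1)/2} t^2 = \tfrac{1}{12}n(n^2 - 1),$$

when $n$ is odd, and

$$\sum_{S(n)} f(x, y) = \tfrac18 n^3$$

when $n$ is even.
Therefore, if $n$ is odd, by (34),

$$\sum_{T(n)} f(x, y) = \tfrac{1}{12}\sum_{\delta|n} (\delta^3 - \delta) = \tfrac{1}{12}\{\Delta_3(n) - \Delta_1(n)\}. \tag{98}$$

Suppose that $n = 2^\alpha m$, where $m$ is odd and $\alpha$ is positive. Then

$$\begin{aligned} \sum_{T(n)} f(x, y) &= \sum_{\delta|n} \left\{\tfrac{1}{12}(\delta^3 - \delta) + \tfrac18\delta^3\left(2^3 + 2^6 + \cdots + 2^{3\alpha}\right)\right\} \\ &= \sum_{\delta|n} \left\{\tfrac{1}{12}(\delta^3 - \delta) + \tfrac17\left(2^{3\alpha} - 1\right)\delta^3\right\} \\ &= \tfrac{1}{84}\{12\Delta'_3(n) - 5\Delta_3(n) - 7\Delta_1(n)\}, \end{aligned} \tag{99}$$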
and, by (98), this holds also when $n$ is odd. Also, by (35),

$$\sum_{T_1(2n)} f(x, y) = \Delta'_3(n). \tag{100}$$

We now apply these results. We have

$$R_8(n) = 2R_4(n) + \sum_{m=1}^{n-1} R_4(m)R_4(n - m). \tag{101}$$

If $n$ is odd one of the numbers $m$, $n - m$ is odd and the other is even, so that, by (2),

$$\begin{aligned} R_8(n) &= 16\Delta_1(n) + 192\sum_{m=1}^{n-1} \Delta_1(m)\Delta_1(n - m) \\ &= 16\Delta_1(n) + 192\sum_{T(n)} f(x, y) \\ &= 16\Delta_3(n). \end{aligned}$$

Also, by (2), (28), (99), (100), and (101),

$$\begin{aligned} R_8(2n) &= 48\Delta_1(n) + 64\sum_{m=1}^{n} \Delta_1(2m-1)\Delta_1(2n-2m+1) + 576\sum_{m=1}^{n-1} \Delta_1(2m)\Delta_1(2n-2m) \\ &= 48\Delta_1(n) + 64\sum_{T_1(2n)} f(x, y) + 576\sum_{T(n)} f(x, y) \\ &= \tfrac{16}{7}\{8\Delta'_3(2n) - 15\Delta_3(2n)\}. \end{aligned}$$

Formula (4) follows from the last two results.
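The three ingredients of this argument can be checked independently of one another. The sketch below assumes that (99) reads $84\sum_{T(n)} xy = 12\Delta'_3(n) - 5\Delta_3(n) - 7\Delta_1(n)$, that (100) reads $\sum_{T_1(2n)} xy = \Delta'_3(n)$, and that (4) is $7R_8(n) = (-1)^n\,16\{8\Delta'_3(n) - 15\Delta_3(n)\}$; these forms, like the function names, are assumptions made for the illustration.

```python
from math import isqrt


def odd_divisors(n):
    return [d for d in range(1, n + 1, 2) if n % d == 0]


def Delta(nu, n):
    return sum(d ** nu for d in odd_divisors(n))


def Delta_prime(nu, n):
    return sum((n // d) ** nu for d in odd_divisors(n))


def T(n):
    """All positive (x, y, a, b) with a*x + b*y = n and x, y odd."""
    return [(x, y, a, (n - a * x) // y)
            for x in range(1, n, 2)
            for a in range(1, (n - 1) // x + 1)
            for y in range(1, n - a * x + 1, 2)
            if (n - a * x) % y == 0]


def sum_xy_T(n):
    return sum(x * y for (x, y, a, b) in T(n))


def sum_xy_T1(n):
    return sum(x * y for (x, y, a, b) in T(n) if a % 2 == 1 and b % 2 == 1)


def representation_counts(s, limit):
    """Return the list [R_s(0), ..., R_s(limit)] by repeated convolution."""
    one_square = [0] * (limit + 1)
    for x in range(-isqrt(limit), isqrt(limit) + 1):
        one_square[x * x] += 1
    counts = [1] + [0] * limit
    for _ in range(s):
        counts = [sum(counts[m] * one_square[n - m] for m in range(n + 1))
                  for n in range(limit + 1)]
    return counts


if __name__ == "__main__":
    R8 = representation_counts(8, 12)
    for n in range(1, 13):
        print(n, R8[n], (-1) ** n * 16 * (8 * Delta_prime(3, n) - 15 * Delta(3, n)) // 7)
```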
By (9) and (100),

$$\begin{aligned} R_{8,0}(8n) &= \sum_{m=0}^{n-1} R_{4,0}(8m+4)R_{4,0}(8n-8m-4) \\ &= 256\sum_{m=0}^{n-1} \Delta_1(2m+1)\Delta_1(2n-2m-1) \\ &= 256\sum_{T_1(2n)} f(x, y) = 256\Delta'_3(n), \end{aligned}$$

which is formula (17). By (11) and (33),

$$\begin{aligned} R_{4,4}(4n) &= \tfrac{35}{18}\sum_{m=0}^{n-1} R_{2,2}(4m+2)R_{2,2}(4n-4m-2) \\ &= 1120\sum_{m=0}^{n-1} \Delta_1(2m+1)\Delta_1(2n-2m-1) \\ &= 1120\sum_{T_1(2n)} f(x, y) = 1120\Delta'_3(n), \end{aligned}$$

which is (18).
We conclude by proving a formula which we shall require in the next section. By (33),
(10) and (12),

$$R_{6,2}(4n+2) = \tfrac74\sum_{m=0}^{n-1} R_{3,1}(4m+3)R_{3,1}(4n-4m-1) = 112\sum_{m=0}^{n-1} \Delta_1(4m+3)\Delta_1(4n-4m-1),$$

and

$$R_{2,6}(4n+2) = \tfrac74\sum_{m=0}^{n} R_{1,3}(4m+1)R_{1,3}(4n-4m+1) = 112\sum_{m=0}^{n} \Delta_1(4m+1)\Delta_1(4n-4m+1).$$

Hence, by (100),

$$\begin{aligned} R_{6,2}(4n+2) + R_{2,6}(4n+2) &= 112\sum_{m=0}^{2n} \Delta_1(2m+1)\Delta_1(4n-2m+1) \\ &= 112\sum_{T_1(4n+2)} f(x, y) \\ &= 112\Delta_3(2n+1). \end{aligned} \tag{102}$$

8 Ten Squares 


Take yj (u, v) = —70, so that, by (66), 


1 3 
filx, y,a,b) = 5{a°x(y) + bx (a)} + sab {ax (x) + bx(y)}. (103) 


Sums of Squares: An Elementary Method 39] 


It follows from Theorem 2 and (38) that 


Yd fies yEmM=4>— x6)d (+) (zy. 


§(2n) d\n 
and 
> fiG@.y.ab) = Dost So fi y.é.m 
T) (2n) dyln = S(2n/61) 
2n n \?> 
= +E Z (BB) 
d p> 56] 56] 
n\3 2n 
= 4) (5) x© > (=) 
b|n by |n/d 
= 4) °(8')*x(8) 
d|n 
= 4E,(n). (104) 


On the other hand, by (103), 


n—1 


> file, y,a,b) = SY {A3(2m + 1)Eo(2n — 2m — 1) 


Ti (11) m=0 
+ 3E4(2m + 1)A, (Qn — 2m — 1)}. (105) 


Put $N = 2^{\alpha} N_1$, where $\alpha \ge 0$ and $N_1 \equiv 3 \pmod 4$. Write

\begin{align}
V_1(N) &= \sum E_2(4m+1)\,A_1(N-4m-1), \tag{106}\\
V_3(N) &= \sum E_2(4m+3)\,A_1(N-4m-3), \tag{107}\\
V_0(N) &= \sum E_2(2m)\,A_1(N-2m), \tag{108}\\
V_0'(N) &= \sum E_2'(2m)\,A_1(N-2m), \tag{109}\\
W(N) &= \sum A_3(2m+1)\,E_0(2N-2m-1), \tag{110}
\end{align}

where in each case the summation is extended over all $m$ for which the arguments of the functions $A_1$, $A_3$, $E_0$, $E_2$ and $E_2'$ are positive.
It follows from (104), (105), (106), (107) and (110) that

\[
W(N) + 3\{V_1(2N) - V_3(2N)\} = 4E_4'(N). \tag{111}
\]

By (33), (102) and (110),

\[
14R_{8,2}(4N) + 3R_{4,6}(4N) = \frac{45}{2} \sum_{m=0}^{N-1} \{R_{6,2}(4m+2) + R_{2,6}(4m+2)\}\,R_{2,0}(4N-4m-2) = 10080\,W(N). \tag{112}
\]
Also, by (2), (3) and (32),

\[
\begin{aligned}
R_{10}(N) &= \sum_{m=1}^{N-1} R_6(m)\,R_4(N-m) + R_6(N) + R_4(N) \\
&= 32 \sum_{m=1}^{N-1} \{4E_2'(m) - E_2(m)\}\{2 + (-1)^{N-m}\}\,A_1(N-m) + R_6(N) + R_4(N).
\end{aligned}
\]
m=1 


Hence we have, if $\alpha = 0$,

\[
R_{10}(N) = 288V_1(N) - 480V_3(N) + 128V_0'(N) - 32V_0(N) + 8A_1(N) + 20E_2'(N), \tag{113}
\]

and, if $\alpha > 0$,

\[
R_{10}(N) = 96V_1(N) - 160V_3(N) + 384V_0'(N) - 96V_0(N) + 24A_1(N) + 16E_2'(N) - 4E_2(N). \tag{114}
\]
The following results may be proved in a similar manner by means of (33), and (1)-(4), (7)-(8). The group of four numbers after each formula indicates the values of $(a, b, c, d)$ used in applying (33).

\begin{align}
R_{6,4}(2N_1) &= 3360\,V_0'(N_1) && (4,2,2,2), \tag{115}\\
&= 13440\,V_1(N_1) && (4,0,2,4), \tag{116}\\
&= -40320\,V_3(N_1) + 1680\,E_2'(N_1) && (6,0,0,4), \tag{117}\\
R_{2,8}(2N_1) &= 1440\{V_1(N_1) - 3V_3(N_1)\} + 180\,E_2'(N_1) && (2,4,0,4), \tag{118}\\
R_{8,2}(4N) &= -1440\,V_3(2N) && (6,0,2,2), \tag{119}\\
&= 11520\,V_0'(N) \quad (\alpha = 0) && (4,0,4,2), \tag{120}\\
&= 11520\{V_1(N) - V_3(N)\} \quad (\alpha > 0) && (4,0,4,2), \tag{121}\\
R_{4,6}(4N) &= 3360\{V_1(2N) - V_3(2N)\} && (2,4,2,2), \tag{122}\\
R_{4,6}(4N) &= 13440\{4V_0'(N) - V_0(N)\} + 3360\,A_1(N) \quad (\alpha = 0) && (4,0,0,6), \tag{123}\\
&= 26880\{V_0'(N) + 3V_1(N) - 3V_3(N)\} + 3360\,E_2'(N) \quad (\alpha = 0) && (4,2,0,4), \tag{124}\\
&= 26880\{3V_0'(N) + V_1(N) - V_3(N)\} + 3360\,E_2'(N) \quad (\alpha > 0) && (4,2,0,4). \tag{125}
\end{align}


By (112), (119) and (122),

\[
V_1(2N) - 3V_3(2N) = W(N), \tag{126}
\]

and therefore, by (111),

\[
2V_1(2N) - 3V_3(2N) = 2E_4'(N). \tag{127}
\]
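The identities (111), (126) and (127) can be spot-checked numerically. The sketch below is only an illustration and rests on assumptions about notation that is defined outside this section: $A_k(n)$ is taken to be the sum of the $k$th powers of the odd divisors of $n$, $E_k(n) = \sum_{d \mid n} \chi(d)\,d^k$ and $E_k'(n) = \sum_{d \mid n} \chi(n/d)\,d^k$, where $\chi$ is the non-principal character modulo 4. The function names are hypothetical.

```python
def chi(d):
    """Non-principal character modulo 4 (assumed meaning of chi)."""
    if d % 2 == 0:
        return 0
    return 1 if d % 4 == 1 else -1

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def A(k, n):
    """Assumed A_k(n): sum of d**k over the odd divisors d of n."""
    return sum(d**k for d in divisors(n) if d % 2 == 1)

def E(k, n):
    """Assumed E_k(n): sum of chi(d) * d**k over the divisors d of n."""
    return sum(chi(d) * d**k for d in divisors(n))

def E_prime(k, n):
    """Assumed E_k'(n): sum of chi(n/d) * d**k over the divisors d of n."""
    return sum(chi(n // d) * d**k for d in divisors(n))

def V1(N):  # (106): j = 4m + 1 runs over 1, 5, 9, ... with N - j positive
    return sum(E(2, j) * A(1, N - j) for j in range(1, N, 4))

def V3(N):  # (107): j = 4m + 3 runs over 3, 7, 11, ... with N - j positive
    return sum(E(2, j) * A(1, N - j) for j in range(3, N, 4))

def W(N):   # (110): j = 2m + 1 runs over the odd numbers below 2N
    return sum(A(3, j) * E(0, 2 * N - j) for j in range(1, 2 * N, 2))

def check(N):
    """Return the truth values of (111), (126) and (127) for a given N."""
    v1, v3, w, e4 = V1(2 * N), V3(2 * N), W(N), E_prime(4, N)
    return (w + 3 * (v1 - v3) == 4 * e4,
            v1 - 3 * v3 == w,
            2 * v1 - 3 * v3 == 2 * e4)

if __name__ == "__main__":
    for N in range(1, 9):
        print(N, check(N))
```

For $N = 3$, for instance, the sketch gives $W(3) = 128$, $V_1(6) = 32$ and $V_3(6) = -32$, in agreement with the three identities.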




We consider first the case $\alpha = 0$. It follows from (115), (116) and (117) that

\[
V_1(N_1) = \tfrac{1}{4} V_0'(N_1), \qquad V_3(N_1) = -\tfrac{1}{24}\{2V_0'(N_1) - E_2'(N_1)\}, \tag{128}
\]

and, from (122), (123), (124) and (128), that

\[
V_0(N_1) = \tfrac{1}{4} A_1(N_1), \qquad V_1(2N_1) - V_3(2N_1) = 16V_0'(N_1). \tag{129}
\]

Thus we have, in addition to (115) and (120),

\[
R_{4,6}(4N_1) = 53760\,V_0'(N_1), \qquad R_{2,8}(2N_1) = 720\,V_0'(N_1), \qquad R_{10}(N_1) = 240\,V_0'(N_1), \tag{130}
\]

and therefore, by (112),

\[
W(N_1) = 32V_0'(N_1). \tag{131}
\]

It follows from (111) that

\[
V_0'(N_1) = \tfrac{1}{20} E_4'(N_1). \tag{132}
\]

Thus we have, by (127), (128), (129), (131) and (132),

\begin{align}
V_1(N_1) &= \tfrac{1}{80} E_4'(N_1), & V_3(N_1) &= -\tfrac{1}{240}\{E_4'(N_1) - 10E_2'(N_1)\}, \tag{133}\\
W(N_1) &= \tfrac{8}{5} E_4'(N_1), & V_1(2N_1) &= \tfrac{1}{40} E_4'(2N_1), \tag{134}\\
V_3(2N_1) &= -\tfrac{1}{40} E_4'(2N_1). \tag{135}
\end{align}

The functions $V_0(N_1)$ and $V_0'(N_1)$ are given by (129) and (132).

We now suppose that $\alpha > 0$. By (119) and (121),

\[
V_3(2N) + 8V_1(N) - 8V_3(N) = 0. \tag{136}
\]

By (127) with $\tfrac{1}{2}N$ in place of $N$,

\[
2V_1(N) - 3V_3(N) = \tfrac{1}{8} E_4'(N). \tag{137}
\]

Eliminating $V_1(N)$ between (136) and (137), we obtain

\[
V_3(2N) + 4V_3(N) + \tfrac{1}{2} E_4'(N) = 0,
\]

which may be set in the following form:

\[
V_3(2N) + \tfrac{1}{40} E_4'(2N) = -4\left\{V_3(N) + \tfrac{1}{40} E_4'(N)\right\}. \tag{138}
\]
It follows from (135) by repeated application of (138) that

\[
V_3(N) = -\tfrac{1}{40} E_4'(N). \tag{139}
\]

Hence, by (126) and (137),

\[
V_1(N) = \tfrac{1}{40} E_4'(N), \qquad W(N) = \tfrac{8}{5} E_4'(N). \tag{140}
\]

By (122), (125), (139) and (140),

\[
V_0'(N) = \tfrac{1}{60} E_4'(N) - \tfrac{1}{24} E_2'(N). \tag{141}
\]


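It remains to evaluate $V_0(N)$ when $N$ is even. By (108), (139) and (140),

\[
\begin{aligned}
V_0(2N) &= \sum_{m=1}^{N-1} E_2(2m)\,A_1(2N-2m) = \sum_{m=1}^{N-1} E_2(m)\,A_1(N-m) \\
&= V_0(N) + V_1(N) + V_3(N) = V_0(N).
\end{aligned} \tag{142}
\]

Similarly, by (129) and (133),

\[
\begin{aligned}
V_0(2N_1) &= V_0(N_1) + V_1(N_1) + V_3(N_1) \\
&= \tfrac{1}{4} A_1(N_1) - \tfrac{1}{120} E_4(N_1) - \tfrac{1}{24} E_2(N_1).
\end{aligned} \tag{143}
\]

It follows from (143) by repeated application of (142) that

\[
V_0(N) = \tfrac{1}{4} A_1(N) - \tfrac{1}{120} E_4(N) - \tfrac{1}{24} E_2(N). \tag{144}
\]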

Formulae (5), (19) and (21) for $\alpha > 0$ now follow from (114), (119), (122), (139), (140), (141) and (144).

In the preceding analysis it has been assumed that $N_1 \equiv 3 \pmod 4$. It might be thought that formulae for the case $N_1 \equiv 1 \pmod 4$ could be obtained in a similar manner. This, however, is not the case. For if we set up the equations corresponding to equations (113)-(125) we find that we cannot eliminate the five functions (106)-(110) with the help of (111) and (112) and obtain formulae for the number of representations $R_{2\nu,10-2\nu}$. The reason for this is, as is well known, that in the general case it is necessary to introduce a new function which is not expressible in terms of divisor functions. This new function is Glaisher's function $\chi_4(n)$, which is defined by

\[
\chi_4(n) = \frac{1}{4} \sum (a + ib)^4,
\]

where the summation is carried out over all representations of $n$ as $a^2 + b^2$. When $n$ is not representable as a sum of two squares, e.g. when $n \equiv 3 \pmod 4$, $\chi_4(n)$ vanishes.
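The role of the residue of $N_1$ modulo 4 can be seen in a small experiment. Combining (130) and (132) gives $R_{10}(N_1) = 12E_4'(N_1)$ when $N_1 \equiv 3 \pmod 4$. The following sketch counts representations directly and compares; it assumes that $E_4'(n) = \sum_{d \mid n} \chi(n/d)\,d^4$ with $\chi$ the non-principal character modulo 4 (the definition lies outside this section), and the helper names are hypothetical.

```python
def chi(d):
    """Non-principal character modulo 4 (assumed meaning of chi)."""
    if d % 2 == 0:
        return 0
    return 1 if d % 4 == 1 else -1

def E_prime(k, n):
    """Assumed E_k'(n): sum of chi(n/d) * d**k over the divisors d of n."""
    return sum(chi(n // d) * d**k for d in range(1, n + 1) if n % d == 0)

def r_k(k, limit):
    """Numbers of representations of 0..limit as sums of k integer squares,
    obtained by multiplying k copies of the truncated theta series."""
    theta = [0] * (limit + 1)
    x = 0
    while x * x <= limit:
        theta[x * x] += 1 if x == 0 else 2   # x and -x give the same square
        x += 1
    counts = [1] + [0] * limit               # zero squares represent only 0
    for _ in range(k):
        new = [0] * (limit + 1)
        for i, c in enumerate(counts):
            if c:
                for j in range(limit + 1 - i):
                    if theta[j]:
                        new[i + j] += c * theta[j]
        counts = new
    return counts

if __name__ == "__main__":
    r10 = r_k(10, 27)
    for N in range(1, 28, 2):
        print(N, N % 4, r10[N], 12 * E_prime(4, N))
```

In this experiment the two columns agree for $N = 3, 7, 11, \ldots$ and disagree already at $N = 1$, where $R_{10}(1) = 20$ while $12E_4'(1) = 12$; the discrepancy is where a function such as $\chi_4$ has to enter.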
9 Twelve Squares


Take wWo(u, v) = 


16 


] 


u 


4 so that, by (65), 


l 
f(x,y) = fox, y) = 5 (xy x? yy: 


By Theorem 2, if n is odd, 


and, if n is even, 


Write 


ey 


S(n) t=} 


| 

ho 

~ 
aS 


= ee — 1)(3n? — 7), 


240 
So fy) = or 
‘ 32 
S(n) 
\[
Z(2n) = \sum_{m=1}^{2n-1} A_1(m)\,A_3(2n-m), \tag{147}
\]

\[
Z'(n) = \sum_{m=1}^{n-1} A_1(m)\,A_3'(n-m), \tag{148}
\]

\[
Z_1(2n) = \sum_{m=1}^{n} A_1(2m-1)\,A_3(2n-2m+1). \tag{149}
\]


It follows from these definitions and from (34), (35), (145) and (146) that 


ZC = > f6.)=>. >, $@9) 


T\ (2n) 5|2n S(2n)/5 


| 
= 30 A;(2n), 


and if m = 2%m, where m is odd and a > 0, 


Z(2n) 


a+] 
> fEN=) > Jews > >. >. Jap 
Tj (2n) d|2n S(d) d|n B=0 §(28 5) 


1 
= 8 (8? 2 19366 ST) PULP ton 22 
> | 320% ( + h(1 +254... 4-25) 


ahs {3A5(2n) — 10A3(2n) + 7A, (2n)} 
ae 3 {A5(2n) — As(2n)}. 




(145) 


(146) 


(147) 


(148) 


(149) 


(150) 


(151) 
We have, by (2), (4) and (32),

\[
\begin{aligned}
R_{12}(2n) &= R_4(2n) + R_8(2n) + \sum_{m=1}^{2n-1} R_4(m)\,R_8(2n-m) \\
&= R_4(2n) + R_8(2n) + 128Z_1(2n) + \frac{384}{7} \sum_{m=1}^{n-1} A_1(2m)\{8A_3'(2n-2m) - 15A_3(2n-2m)\} \\
&= 24A_1(2n) + \frac{16}{7}\{8A_3'(2n) - 15A_3(2n)\} + 512Z_1(2n) + \frac{384}{7}\{8Z'(2n) - 15Z(2n)\}.
\end{aligned} \tag{152}
\]


Similarly, by (33) and the formulae already proved, we obtain

\begin{align}
R_{4,8}(8n) &= 126720\,Z_1(2n) && (4,0,0,8), \tag{153}\\
R_{8,4}(8n) &= 126720\,Z_1(2n) && (4,0,4,4), \tag{154}\\
&= 3041280\,Z'(n) + 126720\,A_3'(n) && (8,0,0,4). \tag{155}
\end{align}

It follows from (150), (153) and (154) that

\[
R_{4,8}(8n) = R_{8,4}(8n) = 3960\,A_5'(2n). \tag{156}
\]

Finally, by (155) and (156),

\[
Z'(n) = \tfrac{1}{24}\{A_5'(n) - A_3'(n)\}, \tag{157}
\]

and (6) follows from (150), (151), (152) and (157).
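The convolution sums of this section are easy to tabulate, which gives an independent check of (152), (156) and (157). The sketch below assumes, since the definitions lie outside this section, that $A_k(n)$ is the sum of the $k$th powers of the odd divisors of $n$ and that $A_k'(n)$ is the sum of $d^k$ over the divisors $d$ of $n$ with $n/d$ odd; it also reads (156) together with (153) as $Z_1(2n) = \tfrac{1}{32}A_5'(2n)$. All function names are hypothetical.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def A(k, n):
    """Assumed A_k(n): sum of d**k over the odd divisors d of n."""
    return sum(d**k for d in divisors(n) if d % 2 == 1)

def A_prime(k, n):
    """Assumed A_k'(n): sum of d**k over the divisors d with n/d odd."""
    return sum(d**k for d in divisors(n) if (n // d) % 2 == 1)

def Z(n):        # as in (147)
    return sum(A(1, m) * A(3, n - m) for m in range(1, n))

def Z_prime(n):  # as in (148), with A_3' in the second factor
    return sum(A(1, m) * A_prime(3, n - m) for m in range(1, n))

def Z1(n2):      # as in (149); n2 = 2n and m runs over the odd numbers
    return sum(A(1, m) * A(3, n2 - m) for m in range(1, n2, 2))

def r_k(k, limit):
    """Numbers of representations of 0..limit as sums of k integer squares."""
    theta = [0] * (limit + 1)
    x = 0
    while x * x <= limit:
        theta[x * x] += 1 if x == 0 else 2
        x += 1
    counts = [1] + [0] * limit
    for _ in range(k):
        new = [0] * (limit + 1)
        for i, c in enumerate(counts):
            for j in range(limit + 1 - i):
                new[i + j] += c * theta[j]
        counts = new
    return counts

def rhs_152_times_7(n):
    """Seven times the last line of (152), kept in integers."""
    m = 2 * n
    return (7 * 24 * A(1, m) + 16 * (8 * A_prime(3, m) - 15 * A(3, m))
            + 7 * 512 * Z1(m) + 384 * (8 * Z_prime(m) - 15 * Z(m)))

if __name__ == "__main__":
    r12 = r_k(12, 12)
    for n in range(1, 7):
        print(n,
              24 * Z_prime(n) == A_prime(5, n) - A_prime(3, n),   # (157)
              32 * Z1(2 * n) == A_prime(5, 2 * n),                # from (153), (156)
              7 * r12[2 * n] == rhs_152_times_7(n))               # (152)
```

For example, $Z'(5) = 125 = (3126 - 126)/24$ and $R_{12}(4) = 7944$.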


10 Concluding Remarks 


If we examine the formulae that have been proved we notice certain relations connecting 
different types of representations, such as 


\[ 5R_{6,0}(8n+6) = 2R_{3,3}(4n+3) \]


and 
3R4.0(8n + 4) = 2Ro.4(8n + 4). 


These are particular cases of the following three theorems. 
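Theorem 3 Let $\alpha, \beta, a, b$ and $k$ be non-negative integers such that

\[
2k \equiv \alpha \pmod 4, \qquad k \equiv a \pmod 4, \qquad 0 \le k < 4,
\]

and suppose that

\[
R_{\alpha,\beta}(8n+2k) = A\,R_{a,b}(4n+k)
\]

for all positive $4n+k$ and fixed $A$. Then

\[
R_{\alpha+2m,\beta}(8n+2k+2m) = A_m R_{a+m,b+m}(4n+k+m) \tag{158}
\]

for $m \ge 0$, where

\[
A_m = 2^m A\,\frac{(\alpha+\beta+2m)!\,(a+b)!\,\alpha!\,(a+m)!\,(b+m)!}{(\alpha+\beta)!\,(\alpha+2m)!\,(a+b+2m)!\,a!\,b!}.
\]

Theorem 4 If $\alpha, \beta, a, b$ and $k$ are non-negative integers such that

\[
k \equiv \alpha \equiv a \pmod 4, \qquad 0 \le k < 8,
\]

and if

\[
R_{\alpha,\beta}(8n+k) = B\,R_{a,b}(8n+k)
\]

for all positive $8n+k$ and fixed $B$, then

\[
R_{\alpha+2m,\beta}(8n+k+2m) = B_m R_{a+2m,b}(8n+k+2m)
\]

for $m \ge 0$, where

\[
B_m = B\,\frac{(\alpha+\beta+2m)!\,\alpha!\,(a+b)!\,(a+2m)!}{(\alpha+\beta)!\,(\alpha+2m)!\,(a+b+2m)!\,a!}.
\]

Theorem 5 If $\alpha, \beta, a, b$ and $k$ are non-negative integers such that

\[
k \equiv \alpha \equiv a \pmod 4, \qquad 0 \le k < 4,
\]

and if

\[
R_{\alpha,\beta}(4n+k) = C\,R_{a,b}(4n+k)
\]

for all positive $4n+k$ and fixed $C$, then

\[
R_{\alpha+m,\beta+m}(4n+k+m) = C_m R_{a+m,b+m}(4n+k+m)
\]

for $m \ge 0$, where

\[
C_m = C\,\frac{(\alpha+\beta+2m)!\,\alpha!\,\beta!\,(a+b)!\,(a+m)!\,(b+m)!}{(\alpha+\beta)!\,(\alpha+m)!\,(\beta+m)!\,(a+b+2m)!\,a!\,b!}.
\]

In all the applications of these theorems $a + b = \alpha + \beta$, so that

\[
A_m = 2^m A\,\frac{\alpha!\,(a+m)!\,(b+m)!}{(\alpha+2m)!\,a!\,b!}, \qquad
B_m = B\,\frac{\alpha!\,(a+2m)!}{(\alpha+2m)!\,a!}, \qquad
C_m = C\,\frac{\alpha!\,\beta!\,(a+m)!\,(b+m)!}{(\alpha+m)!\,(\beta+m)!\,a!\,b!}.
\]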
We may prove Theorem 3 by induction as follows. Assume that (158) holds for a certain value of $m$. Write

\[
\alpha' = \alpha + 2m, \quad a' = a + m, \quad b' = b + m, \quad k' = k + m, \quad n' = n + [\tfrac{1}{4}k'],
\]

\[
D_m = \frac{(\alpha'+\beta+2)(\alpha'+\beta+1)}{(\alpha'+2)(\alpha'+1)}, \qquad
E_m = \frac{(a'+b'+1)(a'+b'+2)}{2(a'+1)(b'+1)}.
\]

Then, by (33),

\[
R_{\alpha'+2,\beta}(8n+2k'+2) = D_m \sum_{\mu=0}^{n'} R_{\alpha',\beta}(8n-8\mu+2k')\,R_{2,0}(8\mu+2),
\]

and

\[
R_{a'+1,b'+1}(4n+k'+1) = E_m \sum_{\mu=0}^{n'} R_{a',b'}(4n-4\mu+k')\,R_{1,1}(4\mu+1).
\]

But

\[
R_{2,0}(8\mu+2) = R_{1,1}(4\mu+1),
\]

and it follows by induction, since $A_m D_m = A_{m+1} E_m$, and since (158) holds for $m = 0$. Theorems 4 and 5 may be proved similarly.


Theorem 6 As particular cases of Theorem 3 we have the following:

(i) If $r \ge 0$,

\[
R_{2r,0}(8n+2r) = \frac{2^r (r!)^2}{(2r)!}\,R_{r,r}(4n+r). \tag{159}
\]

(ii) If $r \ge 0$,

\[
R_{2r,2}(8n+2r) = \frac{2^{r-1}\,r!\,(r+2)!}{(2r)!}\,R_{r,r+2}(4n+r).
\]

Formula (159) was proved by Glaisher by elliptic function theory. 
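Formula (159) can be tested by direct enumeration for small $r$ and $n$. The sketch below assumes that $R_{a,b}(n)$ counts the integer solutions of $x_1^2 + \cdots + x_{a+b}^2 = n$ in which exactly $a$ of the $x_i$ are odd and $b$ are even, the positions of the odd entries being free; this reading is inferred from the formulae of the text (for example $5R_{6,0}(6) = 2R_{3,3}(3) = 320$) and is not quoted from a definition. The function names are hypothetical.

```python
from itertools import product
from math import factorial, isqrt

def R(a, b, n):
    """Count integer vectors of length a + b with squared length n that have
    exactly a odd and b even coordinates (assumed meaning of R_{a,b}(n))."""
    if n < 0:
        return 0
    bound = isqrt(n)
    count = 0
    for xs in product(range(-bound, bound + 1), repeat=a + b):
        if sum(x * x for x in xs) == n and sum(x % 2 for x in xs) == a:
            count += 1
    return count

def check_159(r, n):
    """Check (159) with the fraction cleared:
    (2r)! R_{2r,0}(8n + 2r) == 2^r (r!)^2 R_{r,r}(4n + r)."""
    lhs = factorial(2 * r) * R(2 * r, 0, 8 * n + 2 * r)
    rhs = 2**r * factorial(r)**2 * R(r, r, 4 * n + r)
    return lhs == rhs

if __name__ == "__main__":
    for r in (1, 2, 3):
        for n in (0, 1):
            print(r, n, check_159(r, n))
```

The enumeration is exponential in the number of squares, so it is only meant for the first few cases; for $r = 3$, $n = 1$ it compares $R_{6,0}(14) = 384$ with $R_{3,3}(7) = 960$.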


Theorem 7 As particular cases of Theorem 4 we have the following: 


Gi) [fr > 1, 


———— : n+ rr). 
r 4,4 1( 1)! r.0 


(ii) If $r \ge 0$,

\[
R_{2r+4,4}(8n+2r+4) = \frac{8!\,(2r)!}{4!\,(2r+4)!}\,R_{2r,8}(8n+2r+4).
\]
Solution of the Basic Problems of Discrete
Geometry on the Plane 


S.S. Ryshkov, R.G. Barykinskii, Y.V. Kucherinenko 


1 Motivations, Literature and Basic Notions 


§1.1 Discrete Geometry and Geometry of Numbers 


1°. There are two geometric disciplines with a substantial intersection: discrete geometry and the geometry of numbers. Problems about dispositions of points and figures in a space are customarily assigned to discrete geometry. The same problems belong to the geometry of numbers when they are connected in some way with point lattices in the $n$-dimensional Euclidean space $E^n$; some other problems about lattices are included there as well.

Most of the successes in the basic problems of discrete geometry for $n > 2$ were achieved where these problems could be reduced to problems of the geometry of numbers and solved there.

2°. The following are usually regarded as the basic problems of discrete geometry:

1) Problem about the densest packing of equal balls in the Euclidean space.
2) Problem about the thinnest covering of the Euclidean space by equal balls.
3) Problem about enumeration of all parallelohedra.

Nevertheless, it is often forgotten that the following problem stood among them for many years.

4) Problem about enumeration of all groups of motions of the Euclidean space which have a compact fundamental region.

This is explained by the fact that it is the only problem of discrete geometry that is reduced completely, for any $n$ (by the famous Schoenflies-Bieberbach theorem), to a problem of the geometry of numbers, and for $n \le 4$ it is solved by means of the geometry of numbers and algebra.

As the Schoenflies-Bieberbach theorem has numerous good detailed accounts, especially for $n = 2$ (see for example [1]), we shall not be concerned with problem (4). It seems to us that another simple problem, which is solved for all $n$, should be returned to the basic problems of discrete geometry:

5) Problem about characteristic criteria of lattices, which distinguish them among other point systems.

3°. The case $n = 2$ differs sharply from the general case in that problems (1), (2) and (3) are solved directly within the framework of discrete geometry. In this case all the theories and concepts which accompany the solutions of these problems (the theory of point systems, the theory of L- and DV-tilings, the theory of lattices, the theory of parallelohedra, etc.) are easier and more transparent than for $n > 2$. These circumstances have induced us to write this paper.

4°. We note in conclusion that the paper is a summary of a final-year course of lectures delivered by the first author at the Department of Mechanics and Mathematics of the Lomonosov Moscow State University. Thus it is a textbook and a survey of results at the same time. In preparing this course many sources were used, including the following books and papers, which the authors recommend to the reader.


[1] D. Hilbert and S. Cohn-Vossen, “Anschauliche Geometrie” // Berlin, 1932.

[2] L. Fejes Tóth, “Lagerungen in der Ebene, auf der Kugel und im Raum” // Springer-Verlag, Berlin - Göttingen - Heidelberg, 1953.

[3] B.N. Delone, “The geometry of positive quadratic forms” // Uspehi Mat. Nauk, v. 3 (1937), p. 16-62; v. 4 (1938), p. 102-164. (In Russian.)

[4] S.S. Ryshkov, “Theory of point lattices” // (In Russian, under preparation.)

[5] P.M. Gruber and C.G. Lekkerkerker, “Geometry of numbers” // North-Holland Math. Library, v. 37. North-Holland, Amsterdam - New York - Oxford - Tokyo.


5°. After the number of a theorem we shall sometimes add one of the following labels:

n.n: the analogous theorem is valid for any $n$, and the given proof also extends to any $n$,

n.2: the analogous theorem is valid for any $n$, but the given proof does not extend to any $n$,

2.2: the theorem is valid only for $n = 2$.

Besides, we systematically use the following labels:

▷: the beginning of a proof,

•: the end of a proof,

▷•: the proof is obvious,

iff: if and only if.

6°. We are very grateful to all the mathematicians on whose results the paper is based.

Finally, our thanks go to E.P. Nikiforova, who helped us in the translation of the paper into English, and to Z.D. Lomakina, who carefully read the Russian version of the paper.

The first author is supported by RFFI, grant No. 97-01-00266.


§1.2 Dispositions. Local Finiteness 
1°. Throughout the paper $E^2$ denotes the Euclidean plane with an orthonormal system of coordinates $(O, \xi_1, \xi_2)$, which has its origin at some fixed point $O \in E^2$.

The word “figure” everywhere means a bounded Jordan-measurable subset of the plane. The words “convex set” mean a convex set of the plane containing at least one interior point.


Definition: An arbitrary denumerable system $\mathfrak{T} = \{T_1, T_2, \ldots\}$ of sets $T_1, T_2, \ldots \subset E^2$ is called a disposition (of these sets).
Generally speaking, some elements (sets) of a disposition, or even all of them, can be pairwise congruent, and some of them can intersect or even, for example, coincide with each other.

In what follows we consider only dispositions of bounded sets, without mentioning this specially.

One example of a disposition is an arbitrary (denumerable) point system $\Sigma$ (see below), i.e. points can be considered as elements of a disposition.

A disposition of sets in which no two of them intersect (i.e. $T_i \cap T_j = \emptyset$ for $i, j = 1, 2, \ldots$ and $i \ne j$) is called an exact (disjunctive) packing (of these sets).

A disposition $\mathfrak{T} = \{T_1, T_2, \ldots\}$ for which

\[
\bigcup_{i=1}^{\infty} T_i = E^2
\]

is called a covering of the plane $E^2$ (by these sets).

A disposition $\mathfrak{T} = \{T_1, T_2, \ldots\}$ which is both a covering and an exact packing is called an exact (disjunctive) tiling of the plane $E^2$ (by or with these sets). The sets $T_i$ are called tiles.

A disposition $\mathfrak{T} = \{T_1, T_2, \ldots\}$ in which no two elements have common interior points (i.e. $\mathrm{int}\,T_i \cap \mathrm{int}\,T_j = \emptyset$ for $i, j = 1, 2, \ldots$ and $i \ne j$) is called a (classical) packing (of these sets).

A disposition $\mathfrak{T} = \{T_1, T_2, \ldots\}$ of closed sets which is both a covering and a (classical) packing is called a (classical) tiling of the plane $E^2$ (by or with the tiles $T_i$).

When considering a classical tiling below, we shall not stress the fact that all its tiles are closed sets.

Further, if it is clear which kind of tiling or packing is meant, exact or classical, or if a statement is true for both kinds, we shall use only the words “tiling” or “packing”. Besides, we naturally shall not mention each time that we consider tilings of the (whole) plane.


Definition: A disposition $\mathfrak{T} \subset E^2$ is called regular if there exists a group of motions of the plane which acts transitively on this disposition.


Definition: A disposition $\mathfrak{T} \subset E^2$ is called translationally regular if there exists a group of parallel translations of the plane $E^2$ which acts transitively on this disposition.


As an example, let's consider an “infinite” sheet of squared paper. The family of closed squares of this sheet forms a regular classical tiling, and the family of open squares forms a regular packing. The families of edges and of nodes are regular dispositions. The dispositions of squares and of nodes are also translationally regular. At the same time the disposition of edges is regular but not translationally regular, because a group acting transitively on this disposition necessarily contains a rotation by 90°.

2°. A disposition $\mathfrak{T} = \{T_1, T_2, \ldots\}$ is called locally finite if every point $A \in E^2$ has a neighbourhood intersecting only a finite number of elements (sets) of the disposition $\mathfrak{T}$.


Lemma 1 n.n. For a disposition $\mathfrak{T}$ to be locally finite, it is necessary and sufficient that every bounded set $M \subset E^2$ has nonempty intersection with only a finite number of elements of the disposition $\mathfrak{T}$.

▷ The sufficiency is trivial. Let's prove the necessity. Let a bounded set $M$ be given. If its closure intersects only a finite number of elements of the disposition $\mathfrak{T}$, then $M$ itself also intersects only a finite number of elements of $\mathfrak{T}$. Therefore we can assume that the set $M$ is closed. For each point of the set $M$ let's construct a neighbourhood which has nonempty intersection with only a finite number of elements of the disposition $\mathfrak{T}$. The system of neighbourhoods obtained is an open covering of the compact set $M$. We choose a finite subcovering from this covering. It is obvious that the set $M$ intersects only some of those elements of $\mathfrak{T}$ which intersect the union of all elements of the constructed subcovering.•

If the topological theorem used in the proof is unknown to the reader, he can enclose the set $M$ in a sufficiently large square and take neighbourhoods of its points homothetic to this square. The goal is then reached by applying the Heine-Borel lemma several times.


Lemma 2 n.n. Suppose that for some packing $\mathfrak{T}$ there exist numbers $r > 0$ and $R > 0$ such that every set $T_i \in \mathfrak{T}$ is contained in a closed disk of radius $R$ and contains an open disk of radius $r$. Then the packing $\mathfrak{T}$ is locally finite.


pic. 1.01


▷ Let's take an arbitrary point $A \in E^2$ and $\varepsilon > 0$, and describe the disks $U$ and $V$ of radii $\varepsilon$ and $2R + \varepsilon$ around the point $A$. Each set $T_i \in \mathfrak{T}$ for which $T_i \cap U \ne \emptyset$ is contained in the disk $V$ together with its own disks of radii $R$ and $r$ (pic. 1.01). But the disks of radius $r$ inscribed in the sets $T_i \in \mathfrak{T}$ are pairwise disjoint. Therefore, comparing areas, there are not more than $\left(\frac{2R + \varepsilon}{r}\right)^2$ of them in the disk $V$.•


3°. Theorem 1.1 n.n. Each locally finite tiling with bounded convex sets is a tiling with convex polygons.


▷ As all sets of the tiling are convex, for any two sets having at least one common point there exists a straight line containing their intersection (pic. 1.02). This means that any two elements of the tiling either do not intersect, or have a common point, or have a common segment. From local finiteness we infer that every set of the tiling intersects only a finite number of other sets.•


Let's introduce some notions which we shall need later to varying degrees.

A packing of convex polygons in which any two polygons either do not intersect, or have a common vertex or a common edge, is called a face-to-face packing.

A face-to-face tiling is a face-to-face packing which is also a covering.

Let's consider a face-to-face tiling with convex polygons; then the faces of the tiling are the polygons themselves, their vertices and edges. A facet of a convex polygon is its edge (1-dimensional face).

A tiling with convex polygons is facet-to-facet if any facet of a tile is also a facet of another tile.


Lemma 3 n.2. A locally finite tiling with convex polygons is face-to-face iff it is facet-to-facet. ▷•




A face-to-face tiling with convex polygons is primitive, if in each vertex of the tiling only 
three edges converge. 


§1.3 Point Systems. Delone Systems 


1°. By a point system $\Sigma \subset E^2$ we mean an arbitrary denumerable set of points (an arbitrary disposition of single-point sets).


Definition: A point system $\Sigma \subset E^2$ is called a (1, 0)-system if for some $r > 0$ it has the following “$r$-property”:

Each open disk $U \subset E^2$ of radius $r$ contains not more than one point of the system $\Sigma$ (see pic. 1.03). The least upper bound of the set of such numbers $r$ is denoted by $r_\Sigma$.


pic. 1.03


It is obvious that each (1, 0)-system is locally finite, i.e. is a locally finite disposition of points.


Definition: A point system $\Sigma \subset E^2$ is called a (0, 1)-system if for some $R > 0$ it has the following “$R$-property”:

Each closed disk $U \subset E^2$ of radius $R$ contains at least one point of the system $\Sigma$ (see pic. 1.03). The greatest lower bound of the set of such numbers $R$ is denoted by $R_\Sigma$.

Definition: A point system $\Sigma \subset E^2$ is called a (1, 1)-system or a Delone system if it is both a (1, 0)-system and a (0, 1)-system (pic. 1.03).


Lemma 4 n.n. The affine image of any (1, 1)-system ((1, 0)-, (0, 1)-system) is again a (1, 1)-system ((1, 0)-, (0, 1)-system). ▷•


Usually (with the radius $r$ replaced by $r/2$ in the definition of a (1, 1)-system) a (1, 1)-system is called an $(r, R)$-system or a uniform discrete system. Sometimes a (1, 1)-system is called an $(r, R)$-system, meaning that it has the $r$-property and the $R$-property precisely for the given $r > 0$ and $R > 0$. The constants $r_\Sigma$, $R_\Sigma$ are called the basic constants of $\Sigma$.


Lemma 5 n.n. Each $(r, R)$-system $\Sigma$ is an $(r_\Sigma, R_\Sigma)$-system, but it is not an $(r', R')$-system for any $r' > r_\Sigma$ and (or) $R' < R_\Sigma$. ▷•


Examples: Let's consider on the Cartesian plane $xOy$ three systems $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ of points with coordinates $(x_i, y_j)$, where $x_i = S_i$, $y_j = S_j$, $i, j \in \mathbb{Z}$. Let, firstly, $S_0 = 0$, $S_i = S_{i-1} + i$, $S_{-i} = -S_i$ for $i = 1, 2, \ldots$ (pic. 1.04). It is obvious, that
the obtained system %, is a (1, 0)-system (for any r < v2) but it is not a (0, 1)-system. 
Let, secondly, $S_0 = 0$, $S_i = S_{i-1} + 1/i$, $S_{-i} = -S_i$ for $i = 1, 2, \ldots$ (pic. 1.05). In this case the point system $\Sigma_2$ (its local finiteness follows from the divergence of the harmonic series)


is a (O, 1)-system (for any R > 2) but it is not a (1, 0)-system. 
Let the system $\Sigma_3$ coincide with the system $\Sigma_1$ in the right half-plane and with the system $\Sigma_2$ in the left one. Then, as is easy to see, the system $\Sigma_3$ is neither a (0, 1)- nor a (1, 0)-system.
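The behaviour of the two coordinate sequences behind these examples can be tabulated directly. The sketch below is only an illustration: it assumes that the second sequence is built from the steps $1/i$ (which is what the reference to the harmonic series suggests), and the helper name `coordinates` is hypothetical. It lists the first non-negative coordinates $S_i$ and the gaps between neighbours, showing that the gaps of the first sequence grow without bound (so no $R$-property can hold), while the gaps of the second tend to zero (so no $r$-property can hold) although the coordinates themselves still grow.

```python
from fractions import Fraction

def coordinates(step, count):
    """Return S_0, ..., S_count with S_0 = 0 and S_i = S_{i-1} + step(i)."""
    s = [Fraction(0)]
    for i in range(1, count + 1):
        s.append(s[-1] + step(i))
    return s

def gaps(s):
    """Differences between consecutive coordinates."""
    return [b - a for a, b in zip(s, s[1:])]

s1 = coordinates(lambda i: Fraction(i), 6)      # steps 1, 2, 3, ...
s2 = coordinates(lambda i: Fraction(1, i), 6)   # steps 1, 1/2, 1/3, ... (assumed)

if __name__ == "__main__":
    print("first sequence :", [str(x) for x in s1], "largest gap", max(gaps(s1)))
    print("second sequence:", [str(x) for x in s2], "smallest gap", min(gaps(s2)))
```

The point systems themselves are the grids $\{(S_i, S_j)\}$, so the gaps of the sequence are also the side lengths of the grid cells.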


2 L-tilings and DV -tilings 


§2.1 L-tilings. Empty disk (“la sphère vide”). Necessary and sufficient criterion
1°. Let a (1, 1)-system $\Sigma \subset E^2$ be given.


Definition: A convex figure $L \subset E^2$ is called an L-figure of the (1, 1)-system $\Sigma \subset E^2$ if $L$ is the convex hull of a subset $\Sigma' \subset \Sigma$ satisfying the following conditions:


1) $\Sigma' \subset L^\circ$, where $L^\circ$ is some circle.

2) $\Sigma'$ uniquely determines the circle $L^\circ$.

3) $L^* \cap \Sigma = \Sigma'$, where $L^*$ is the disk bounded by the circle $L^\circ$, and the interior of the disk $L^*$ contains no points of the system $\Sigma$.




The circle $L^\circ$ is called an L-circle of the system $\Sigma$ and the disk $L^*$ is called an L-disk of $\Sigma$. Further, $\{L\}_\Sigma$, $\{L^\circ\}_\Sigma$ and $\{L^*\}_\Sigma$ denote, respectively, the families of all L-figures, L-circles and L-disks of the system $\Sigma$ (pic. 2.01).


Lemma 1 n.n. Every L-figure of an arbitrary (1, 1)-system is a convex bounded polygon.


▷ The set $\Sigma'$ is finite as a bounded subset of the given (1, 1)-system. As the set $\Sigma'$ uniquely determines the circle and thus has rank 2, the lemma is proved.•


Lemma 2 n.n. (“Empty disk”.) Every (1, 1)-system has at least one L-polygon.


▷ Let a (1, 1)-system $\Sigma \subset E^2$ be given. Let's consider “la sphère vide”, that is, a disk containing no points of the system $\Sigma$ either inside or on the boundary (pic. 2.01). Let's move this disk in any manner, simultaneously increasing its radius. At some moment our disk will meet, generally speaking, one point of the system $\Sigma$; let it be the point $A$. Then we move the disk, continuously increasing its radius, in such a way that the point $A$ remains on the boundary of the disk. At some moment the disk will meet one more point of $\Sigma$; let it be the point $B$. Then we continue to increase the radius of the disk in such a way that the points $A$ and $B$ remain on its boundary. At some moment the disk will meet one more point of $\Sigma$; let it be the point $C$. Each of the points mentioned exists, because any disk of radius $R_\Sigma$ must contain points of the system $\Sigma$. Thus the points which have appeared on the boundary of this disk determine it completely. The disk obtained is obviously an L-disk of $\Sigma$, and the convex hull of the points of $\Sigma$ lying on its boundary is an L-figure of the system $\Sigma$.•


Lemma 3 n.n. For every (1, 1)-system $\Sigma$ the sets $\{L\}_\Sigma$, $\{L^\circ\}_\Sigma$ and $\{L^*\}_\Sigma$ are denumerable. This means that these sets are dispositions. ▷•


Theorem 2.1 n.n. The family $\{L\}_\Sigma$ of all L-polygons of an arbitrary (1, 1)-system $\Sigma \subset E^2$ is a locally finite face-to-face tiling of the plane.


▷ Let a (1, 1)-system $\Sigma$ and all its L-polygons be given. Let's divide the proof into several parts.

1) We first remark that every bounded set intersects only a finite number of L-polygons. Indeed, let $M \subset E^2$ be such a set, $d$ its diameter and $A \in M$. The point $A$ can belong only to those L-polygons whose vertices are at a distance of not more than $2R_\Sigma$ from it; thus the diameter of the set of vertices of those L-polygons which can intersect the set $M$ does not exceed $d + 4R_\Sigma$. The finiteness of the number of L-polygons with vertices in such a set is obvious.

2) Let's show that the set $\{L\}_\Sigma$ is a face-to-face packing.

Indeed, let two different L-polygons $L_1$ and $L_2$ have common points. It follows immediately that the corresponding L-disks $L_1^*$ and $L_2^*$ are different, but their intersection is not empty. Let's consider the straight line $P$ through the common chord of these disks. This line divides each of the disks into two parts, “caps”; it will be convenient for us to consider both caps of every disk to be closed. One of the two caps of the disk $L_1^*$ lies completely in the disk $L_2^*$; we call it the interior cap and the other one the exterior cap (with respect to the disk $L_2^*$). The corresponding names are given to the caps of the disk $L_2^*$. It is obvious that points of the system $\Sigma$ can be located only, firstly, outside the intersection of the disks $L_1^* \cap L_2^*$ (we do not need such points now) and, secondly, in the intersection of the L-circles $L_1^\circ$ and $L_2^\circ$ (these are vertices of the polygons $L_1$ and $L_2$, pic. 2.02). The vertices of the polygon $L_1$ ($L_2$) lie on the boundary of the exterior cap of the disk $L_1^*$ ($L_2^*$). Hence it follows that the intersection $L_1 \cap L_2$ lies in the intersection of the exterior caps, i.e. the polygons $L_1$ and $L_2$ adjoin each other either at a vertex or along an edge.

3) Let's now consider any polygon $L_1 \in \{L\}_\Sigma$ and one of its edges $F$. Let's also consider the straight line $P$ which contains $F$. The center $O$ of the disk $L_1^*$ lies on the perpendicular to $P$ at the point $O'$, the midpoint of $F$. The line $P$ divides the disk $L_1^*$ into two caps (see above), one of which contains points of the system $\Sigma$ only at the ends of the edge $F$, the points $A$ and $B$ of intersection of $L_1^\circ$ with $P$. On the straight line $OO'$ we mark the point $O_t$ at distance $t$ from the point $O$ in the direction of the empty cap (pic. 2.03). Further, we construct the disk $U_t$ with center at the point $O_t$ whose boundary meets the line $P$ at the points $A$ and $B$. For small $t$ the disk $U_t$ contains only the points $A$ and $B$ of $\Sigma$. For large values of $t$ the disk $U_t$ contains new points of $\Sigma$; we denote by $t'$ the least such $t$. It is obvious that the disk $U_{t'}$ contains new points only on its boundary. It follows that the boundary of the disk $U_{t'}$ is an L-circle, and the edge $F$ together with the new points determines the L-polygon $L_2 \in \{L\}_\Sigma$ adjacent to the polygon $L_1$ along $F$. So, for any L-polygon we have found another L-polygon adjacent to the initial one along a given edge.

4) It remains for us to show that each point $A \in E^2$ lies in at least one polygon of the set $\{L\}_\Sigma$.

Let's consider an arbitrary point $A \in E^2$, an arbitrary polygon $L \in \{L\}_\Sigma$, a point $O \in \mathrm{int}\,L$ and the segment $[O, A]$. The segment $[O, A]$ intersects only a finite number of L-polygons, because the disposition $\{L\}_\Sigma$ is locally finite (pic. 2.04). Without loss of generality, we may choose a point $O' \in \mathrm{int}\,L$ such that the half-interval $[O', A)$ intersects L-polygons only at their interior points and at interior points of their edges (“Lemma about shashlik”, n.n.). Then, applying the result of item (3) a finite number of times, we find an L-polygon which contains the point $A$.•

Let $\Sigma \subset E^2$ be an arbitrary (1, 1)-system. The above theorem allows us to introduce the following definition:


Definition: The tiling $\{L\}_\Sigma$ is called the L-tiling or the Delone tiling of the (1, 1)-system $\Sigma$ (for $\Sigma$) (pic. 2.01).


Corollary: n.n. (from Theorem 2.1.) For any (1, 1)-system $\Sigma$ the tiling $\{L\}_\Sigma$ is unique. ▷•

pic. 2.03


We note that for any (1, 1)-system $\Sigma$ the star of any vertex of the tiling $\{L\}_\Sigma$, i.e. the set of all its tiles incident to this vertex, is finite. This follows from the local finiteness of the L-tilings for such systems. (The star of any edge of the tiling $\{L\}_\Sigma$, i.e. the set of all its tiles incident to this edge, consists of two tiles.)

We now make one more simple but important statement.


Theorem 2.2 If a (1, 1)-system $\Sigma$ is mapped onto itself by some motion (of the first or second kind), then the tiling $\{L\}_\Sigma$ is also mapped onto itself by this motion. ▷•


3°. L-tiling. (Necessary and sufficient criterion.) Let a (1, 1)-system $\Sigma \subset E^2$ and a locally finite face-to-face tiling $\mathfrak{T}$ with convex polygons be given.

We shall say that the tiling $\mathfrak{T}$ is compatible with the (1, 1)-system $\Sigma$ if every point of $\Sigma$ is a vertex of $\mathfrak{T}$ and every vertex of $\mathfrak{T}$ is a point of $\Sigma$.


Theorem 2.3 n.n. Let $\mathfrak{T}$ be a tiling compatible with some (1, 1)-system $\Sigma$. Then the tiling $\mathfrak{T}$ is the L-tiling of $\Sigma$ iff:


1) Every polygon of the tiling $\mathfrak{T}$ can be inscribed in a circle.

2) For any pair of polygons of the tiling $\mathfrak{T}$ adjacent along an edge, every vertex of one of them which does not belong to this edge lies outside the disk circumscribed around the other one.


▷ The necessity of the specified conditions follows from the definition.
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Let’s prove the sufficiency. Let 7; be an arbitrary polygon of the tiling { and 7," be its 
disk, circumscribed around it. It is obvious, that for the proof of the theorem it is enough 
to prove the emptyness of the disk T;* from points of &, which are not vertices of 7}. 

Let’s consider an arbitrary point A ∈ Σ which is not a vertex of T_1. As in item (3)
of the proof of theorem 2.1, let’s choose a point O ∈ int T_1 and consider the polygons
intersected by the half-interval [O, A). For a more convenient description of the situation,
let’s assume that the segment [O, A] is horizontal and that the sequence of the polygons
goes from left to right. Let these polygons be numbered in sequence T_1, T_2, ..., T_m so that
A ∈ T_m. Let T_i*, where i = 1, 2, ..., m, be the disk circumscribed around the polygon
T_i, and let P_j, where j = 1, 2, ..., m − 1, be the straight line containing the intersection
T_j ∩ T_{j+1} (pic. 2.05).

By our construction of the segment [O, A], the point A lies to the right of all the lines P_j.
Let’s consider the polygons T_{m−1} and T_m. By condition (2), the point A does not belong
to the disk T*_{m−1}. The interior cap of the disk T*_{m−2} with respect to the disk T*_{m−1} is contained
inside T*_{m−1}, which means that the cap does not contain A. And as the point A lies to the
right of the straight line P_{m−2}, it is not contained in the disk T*_{m−2} (pic. 2.05). Carrying
out the same reasoning sequentially for the remaining polygons T_i, i = m − 3, m − 4, ..., 1,
we obtain that the point A lies outside T_1*. That means (once again) that whatever
polygon T_1 ∈ 𝔗 is given, any point A ∈ Σ which is not its vertex can lie neither inside nor
on the boundary of T_1*. ∎


4°. Example 1 Let’s consider a tiling 𝔗_a by regular triangles (with an edge a). Let Σ_1
and Σ_2 be the point systems consisting, respectively, of the vertices of 𝔗_a and of the
centroids of the triangles of 𝔗_a (pic. 2.06). It is obvious that the L-tiling of Σ_1 is 𝔗_a. (Find
the L-tiling of Σ_2.)
Example 2. Let’s consider the system Σ consisting of the points of the lattice Z² and
all points (x, y) = (k + 1/4, l + 1/2), where k and l are integers. The system Σ is a
(1, 1)-system with basic constants r_Σ = √10/4 and R_Σ = 5/8.

Let’s consider the triangles (pic. 2.07): 


T_1 = {(−3/4, 1/2), (0, 1), (1/4, 1/2)},
T_2 = {(−3/4, 1/2), (0, 0), (1/4, 1/2)},
T_3 = {(0, 0), (1, 0), (1/4, 1/2)},
T_4 = {(0, 1), (1, 1), (1/4, 1/2)}.


It is easy to see that the set 𝔗 of all triangles, each of them being a parallel translate of one
of the four constructed, forms a face-to-face tiling of the plane compatible with the system
Σ.

Let’s check that 𝔗 is the L-tiling of Σ. The first condition of the sufficient criterion
is trivial. To check the second condition we write out the equations of the
circles circumscribed around our four triangles:


T_1* = {(X + 1/4)² + (Y − 9/16)² = 65/256},
T_2* = {(X + 1/4)² + (Y − 7/16)² = 65/256},
T_3* = {(X − 1/2)² + (Y − 1/16)² = 65/256},
T_4* = {(X − 1/2)² + (Y − 15/16)² = 65/256}.


The vertices of the triangles adjacent to the ones under study (those vertices not lying on
the common edge) have, respectively, the following coordinates:


T_1: {(−1, 1), (1, 1), (0, 0)},
T_2: {(−1, 0), (0, 1), (1, 0)},
T_3: {(−3/4, 1/2), (5/4, 1/2), (1/4, −1/2)},
T_4: {(−3/4, 1/2), (5/4, 1/2), (1/4, 3/2)}.


Substituting these coordinates into the equations of the circles T_1*, T_2*, T_3* and T_4*, we can
convince ourselves that condition (2) is satisfied and 𝔗 is an L-tiling.
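
The substitution described above can be carried out mechanically. The following is a minimal sketch in Python, not part of the original text; the names TRIANGLES, CENTRES, NEIGHBOUR_APEXES and check_criterion are chosen here for illustration. It uses exact rational arithmetic to test both conditions of theorem 2.3 for the four triangles of Example 2.

```python
from fractions import Fraction as F

# Base triangles T1..T4 of Example 2 (exact rational coordinates).
TRIANGLES = {
    "T1": [(F(-3, 4), F(1, 2)), (F(0), F(1)), (F(1, 4), F(1, 2))],
    "T2": [(F(-3, 4), F(1, 2)), (F(0), F(0)), (F(1, 4), F(1, 2))],
    "T3": [(F(0), F(0)), (F(1), F(0)), (F(1, 4), F(1, 2))],
    "T4": [(F(0), F(1)), (F(1), F(1)), (F(1, 4), F(1, 2))],
}
# Centres of the circumscribed circles as given in the text.
CENTRES = {
    "T1": (F(-1, 4), F(9, 16)),
    "T2": (F(-1, 4), F(7, 16)),
    "T3": (F(1, 2), F(1, 16)),
    "T4": (F(1, 2), F(15, 16)),
}
R2 = F(65, 256)  # common squared radius of the four circles
# Vertices of the adjacent triangles that do not lie on the common edge.
NEIGHBOUR_APEXES = {
    "T1": [(F(-1), F(1)), (F(1), F(1)), (F(0), F(0))],
    "T2": [(F(-1), F(0)), (F(0), F(1)), (F(1), F(0))],
    "T3": [(F(-3, 4), F(1, 2)), (F(5, 4), F(1, 2)), (F(1, 4), F(-1, 2))],
    "T4": [(F(-3, 4), F(1, 2)), (F(5, 4), F(1, 2)), (F(1, 4), F(3, 2))],
}

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def check_criterion():
    """Return True if conditions 1) and 2) of theorem 2.3 hold for T1..T4."""
    for name, tri in TRIANGLES.items():
        c = CENTRES[name]
        # Condition 1: every vertex lies on the stated circle.
        if any(dist2(v, c) != R2 for v in tri):
            return False
        # Condition 2: every neighbouring apex lies strictly outside the disk.
        if any(dist2(a, c) <= R2 for a in NEIGHBOUR_APEXES[name]):
            return False
    return True
```

Since every tile of 𝔗 is a translate of one of the four triangles, checking these four is enough for the whole tiling.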

5° (n.n). The content of this item is not used further, therefore all information
given in it is stated without proofs.

Let ε > 0. Then a (1, 1)-system Σ' is called ε-close to a system Σ if the points of the
systems Σ and Σ' can be numbered in such a way that for any natural i the
inequality ρ(A_i, A_i') < ε holds, where A_i ∈ Σ and A_i' ∈ Σ'.


Theorem 2.4 For an arbitrary (1, 1)-system Σ and any number ε > 0 there exists an ε-close
to it (1, 1)-system Σ' such that the tiling L_Σ' is simplicial.


The statement “converse” to this theorem is incorrect. That means that there exists
a Delone system Σ with a simplicial tiling L_Σ such that for any number ε > 0 there exists
an ε-close to it Delone system Σ' whose L-tiling is not simplicial.

(We recommend that the reader construct such an example independently.)


Theorem 2.5 For an arbitrary Delone system Σ whose tiling L_Σ is simplicial,
and for any compact set F ⊂ E², there exists a number ε > 0 such that any Delone system
Σ' which is ε-close to Σ and differs from the system Σ only in the ε-neighbourhood of the
compact set F has a simplicial L-tiling.
§2.2 DV-tiling


1°. Definition: Let us be given a (1, 1)-system Σ ⊂ E². The Dirichlet-Voronoi domain
(DV-domain) of Σ with center at a point A ∈ Σ is the set DV_A of all points X ∈ E² such
that ρ(A, X) ≤ ρ(B, X) for all points B ∈ Σ.


The family of all DV-domains of Σ with centers at all points of Σ we shall denote
by {DV}_Σ (pic. 2.08).


Theorem 2.6 n.n. For any (1, 1)-system, i.e. a Delone system, Σ ⊂ E² and for any
point A ∈ Σ the domain DV_A is a closed polygon whose diameter and number
of edges do not exceed bounds depending only on the basic constants of
the system Σ.


> Let us be given a (1, 1)-system Σ and a point A ∈ Σ. Any point X ∈ E² with
ρ(A, X) > R_Σ cannot belong to the set DV_A, as in its neighbourhood of radius R_Σ there
exists a point B ∈ Σ, obviously different from the point A.

Further, the domain DV_A is the intersection of closed half-planes bounded by the
perpendicular bisectors of the segments [A, B], where B ∈ Σ − {A}
(pic. 2.08). Let’s note that the half-planes for which |AB| > 2R_Σ cannot
take part in the formation of the domain DV_A, as the complements of these half-planes are
at a positive distance from the domain DV_A. Thus, not more than [(2R_Σ + r_Σ)/r_Σ]²
half-planes can take part in the formation of the domain DV_A. ∎




pic. 2.08


Theorem 2.7 For any convex polygon M ⊂ E² there exists a (1, 1)-system Σ such that
M ∈ {DV}_Σ.


> Let’s choose an arbitrary point A ∈ int M and reflect it in all edges of the polygon M.
Let A' be one of these reflected points, F be the corresponding edge and P be the straight
line carrying that edge. All points of the plane E² which are closer to the point A than
to the point A', together with the straight line P, form one of the half-planes which determine
the polygon M. Thus, the point A and all its mirror images give a (1, 0)-system for which
the polygon M satisfies all conditions of the definition of the domain DV_A.

Let’s now construct the (1, 1)-system we need. Let the diameter of M be equal
to d. Let’s take an arbitrary (1, 1)-system, remove from it all points which fall into
the disk of radius 3d with center at A, and add instead of them the (1, 0)-system
constructed above. ∎


Theorem 2.8 n.n. The family {DV}_Σ of all DV-domains of an arbitrary (1, 1)-system Σ
is a locally finite face-to-face tiling. (This tiling is called the Dirichlet-Voronoi tiling or
the DV-tiling.)


> The fact that {DV}_Σ is a classical tiling is obvious. It is also obvious that any bounded
set intersects only a finite number of DV-domains of Σ.

Suppose now that for some (1, 1)-system Σ the tiling {DV}_Σ is not face-to-face. Then there
is a domain DV_A ∈ {DV}_Σ and its edge F such that at interior points of the edge F
at least two DV-domains DV_B and DV_C adjoin the domain DV_A. But then the
segments [A, B] and [A, C], not being segments of the same straight line, are simultaneously
orthogonal to the straight line which contains the edge F, which is impossible. ∎

We note that the star of any vertex and any edge of {DV}_Σ for any (1, 1)-system is
finite (pic. 2.08). It follows from the local finiteness of DV-tilings for such systems.


Theorem 2.9 If a (1, 1)-system is mapped onto itself by some motion (of the first or
second kind), then the DV-tiling of the (1, 1)-system is mapped onto itself by the
same motion. ∎


2°. Example 1 Let’s consider 𝔗_a (pic. 2.06) and the systems Σ_1 and Σ_2 as well (see §2,
item 4°). It is obvious that the DV-tiling of Σ_2 is the tiling 𝔗_a. (Find the L-tiling
of Σ_1.)


Example 2 Let’s consider the Delone system Σ constructed in item 3°, §2 (pic. 2.07).
It is obvious that there exist DV-domains of two and only two kinds: with integer and with
fractional centers. Let’s construct these domains. The domain DV_(0,0) has vertices (−1/4,
7/16), (1/2, 1/16), (1/2, −1/16), (−1/4, −7/16), (−1/2, −1/16), (−1/2, 1/16).


The domain DV_(1/4,1/2) has vertices (1/2, 15/16), (3/4, 9/16), (3/4, 7/16), (1/2,
1/16), (−1/4, 7/16), (−1/4, 9/16).


§2.3 Duality of L-tilings and DV-tilings

1°. Let us be given a face-to-face tiling 𝔗 = {T_1, T_2, ...} of the plane E². We shall denote
all faces of the tiling by letters with upper indices 0, 1, 2 (the dimensions of the faces),
dimension 2 meaning tiles. Let, further, the letters r and q (possibly with indices)
satisfy the relation r + q = 2.


Definition: Two face-to-face tilings 𝔗 and 𝔐 of the plane E² are called combinatorially-
metric dual if there exists a one-to-one correspondence between the faces of the tilings 𝔗
and 𝔐 satisfying the following conditions:


1) An arbitrary r-dimensional face of the tiling 𝔗 (𝔐) corresponds to some q-dimensional
face of the tiling 𝔐 (𝔗).

2) Under this correspondence the inclusion T^r ⊂ T^r' (M^q ⊂ M^q') entails the
inclusion M^q' ⊂ M^q (T^r' ⊂ T^r) for the corresponding faces.

3) Any two corresponding faces of the tilings 𝔗 and 𝔐 are mutually orthogonal.


(If only the first two conditions are fulfilled, then the face-to-face tilings 𝔗 and 𝔐 are
called combinatorially dual.)


2°. Theorem 2.10 n.n. For any (1, 1)-system Σ ⊂ E² the tilings {DV}_Σ and {L}_Σ are
combinatorially-metric dual.


> To begin with, we note that each vertex of the {L}_Σ-tiling, being a point of Σ, is the
centre of some polygon of the {DV}_Σ-tiling. Conversely, the centre of each polygon
of {DV}_Σ, being a point of Σ, is a vertex of {L}_Σ. That means that the vertices of {L}_Σ and
the polygons of {DV}_Σ are in one-to-one correspondence.

The centres of the disks circumscribed around {L}_Σ-tiles, and only they, have this property: the
set of points of Σ lying at the minimum distance from them has rank 2. Therefore, the
set of these centres coincides with the set of {DV}_Σ-vertices. That means that {L}_Σ-tiles
and {DV}_Σ-vertices are in one-to-one correspondence.

It is obvious that the constructed correspondences satisfy all conditions of the duality.
Thus, there exists a natural duality of the {DV}_Σ- and {L}_Σ-tilings in dimensions zero and
two.

Let’s consider an arbitrary edge L¹ ∈ {L}_Σ with vertices A and B. This edge is the
common edge of two {L}_Σ-tiles, which we shall denote by L_1 and L_2. Denote the centers
of the disks L_1* and L_2* (see §2) by O_1 and O_2 (pic. 2.09). According to the aforesaid, the
points O_1 and O_2 are 0-faces (vertices) of some two tiles of {DV}_Σ. Let’s show that these
are the tiles DV_A and DV_B, and that the segment O_1O_2 is their common face. The distances
from any point of the straight line O_1O_2 to the points A and B are equal, i.e. the tile
DV_A lies in the half-plane containing the point A with respect to O_1O_2. We shall also note
that the perpendicular bisectors of the segments which connect the point A with the vertices of the tiles
L_1 and L_2 (not including A and B) intersect the straight line O_1O_2 at the points O_1 and
O_2 respectively. And the perpendicular bisectors of the segments which connect the point A
with any other L-vertex X intersect the straight line O_1O_2 outside the segment O_1O_2.
Otherwise X ∈ L_1* ∪ L_2* (pic. 2.09). Hence, the segment O_1O_2 is a face of the tile
DV_A. Similarly for DV_B.


Let’s associate with every edge L¹ ∈ {L}_Σ the constructed edge O_1O_2 ∈ {DV}_Σ. From
the construction of this correspondence one can see that for every L-edge (DV-edge)
there exists some DV-edge (L-edge), and these edges are orthogonal.

Let’s prove that this correspondence between edges (1-faces) of the L-tiling and edges
(1-faces) of the DV-tiling is one-to-one.

Suppose it isn’t so. Let some L-edge correspond to more than one DV-edge, for example, to
two. But then these edges have the same ends, which is a contradiction. Let, on the
contrary, some DV-edge correspond to more than one L-edge, for example, to two. Then
these two edges lie on one straight line. That is also impossible.


pic. 2.09 

Thus, the correspondence between 1-dimensional faces satisfies conditions (1) and (3) of
the duality. Condition (2) is checked immediately.

Thus, the tilings {DV}_Σ and {L}_Σ are combinatorially-metric dual. ∎

3°. As a corollary of the proof of this theorem, we have the following important fact:


Lemma 8 There exists a one-to-one correspondence between L-edges and pairs of DV-tiles
adjacent along the corresponding DV-edges. Every L-edge is orthogonal to the straight line
which contains the corresponding DV-edge, and the L-edge is bisected by this straight
line. ∎


It enables us to introduce the following definition: 


Definition (n.n.): Every edge of an L-tiling with a vertex at a point A, directed
away from the point A, is called a contiguous vector of DV_A.


Theorem 2.11 n.n. Let us be given a Delone system Σ. If the tiling {L}_Σ is simplicial,
then the tiling {DV}_Σ is primitive. Conversely, if the tiling {DV}_Σ is primitive, then the
tiling {L}_Σ is simplicial (pic. 2.10).


pic. 2.10 


> The statement directly follows from the duality of the tilings. ∎

4°. See item 5° §1.

Theorem 2.12 n.n. For an arbitrary Delone system Σ and for any ε > 0 there exists an
ε-close to it Delone system Σ' such that the tiling DV_Σ' is primitive.


> The statement directly follows from theorem 2.4 and from the duality of the
tilings. ∎


Theorem 2.13 n.n. For an arbitrary Delone system Σ with a primitive DV-tiling and
for any bounded set F ⊂ E² there exists ε > 0 such that any Delone system Σ' which is
ε-close to Σ and differs from the system Σ only in the ε-neighbourhood of F has a primitive
DV-tiling. ∎


5°. Example 1 Let’s consider the tiling 𝔗_a and the systems Σ_1 and Σ_2 (see §2, item 4°,
pic. 2.06). It is obvious that DV_Σ2 is 𝔗_a. (Find the L-tiling for the system Σ_1.)


Example 2 Let’s consider the Delone system Σ from item 3°, §2 (pic. 2.07). It is
obvious that there exist DV-domains of two and only two kinds: with integer and with
fractional centers. Let’s construct these domains, using the correspondence between edges
and vertices of the DV- and L-tilings.


The domain DV_(0,0) has vertices (−1/4, 7/16), (1/2, 1/16), (1/2, −1/16), (−1/4, −7/16),
(−1/2, −1/16), (−1/2, 1/16).

The domain DV_(1/4,1/2) has vertices (1/2, 15/16), (3/4, 9/16), (3/4, 7/16), (1/2,
1/16), (−1/4, 7/16), (−1/4, 9/16).
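
By the duality of theorem 2.10, the vertices of DV_A are the centres of the circles circumscribed around the L-tiles that have A as a vertex. The following Python sketch is an illustration added here, not the authors' construction; the names BASE, circumcentre and dv_vertices are hypothetical. It recomputes both vertex lists from the four base triangles of pic. 2.07 and their integer translates, in exact rational arithmetic.

```python
from fractions import Fraction as F

# The four base triangles of the L-tiling from pic. 2.07.
BASE = [
    [(F(-3, 4), F(1, 2)), (F(0), F(1)), (F(1, 4), F(1, 2))],
    [(F(-3, 4), F(1, 2)), (F(0), F(0)), (F(1, 4), F(1, 2))],
    [(F(0), F(0)), (F(1), F(0)), (F(1, 4), F(1, 2))],
    [(F(0), F(1)), (F(1), F(1)), (F(1, 4), F(1, 2))],
]

def circumcentre(a, b, c):
    """Centre of the circle through three non-collinear points."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    na, nb, nc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (na * (by - cy) + nb * (cy - ay) + nc * (ay - by)) / d
    uy = (na * (cx - bx) + nb * (ax - cx) + nc * (bx - ax)) / d
    return (ux, uy)

def dv_vertices(centre, shifts=range(-2, 3)):
    """Circumcentres of all translated base triangles having `centre` as a vertex."""
    found = set()
    for tri in BASE:
        for i in shifts:
            for j in shifts:
                moved = [(x + i, y + j) for x, y in tri]
                if centre in moved:
                    found.add(circumcentre(*moved))
    return found
```

Calling dv_vertices((F(0), F(0))) and dv_vertices((F(1, 4), F(1, 2))) gives six points each, which can be compared with the two lists above.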


3 Lattices 


§3.1 Lattices and their Basic Frames

1°. Let’s consider in the plane E² an arbitrary frame ℰ = ℰ(O, e_1, e_2), i.e. two linearly
independent vectors e_1, e_2 with the beginning at the same point O ∈ E², which it is convenient
to take coinciding with the origin of the coordinate system (O, ξ_1, ξ_2). See §1.2.


Definition: The set Γ_ℰ of all points which have integer coordinates in the frame ℰ, i.e.
the set of points g = g_1e_1 + g_2e_2, where g_1, g_2 ∈ Z, is called a point lattice of rank 2, or
a 2-dimensional point lattice, or simply a lattice (pic. 3.01). (Later we shall
denote Γ_ℰ by Γ if it is not important which frame determines the lattice.)


In other words, the lattice Γ is an affine image of the integer lattice Z² = {(ξ_1, ξ_2) : ξ_1, ξ_2
∈ Z}.

Each frame which determines a lattice Γ is called a basic frame of Γ, and the parallel-
ogram Π(ℰ) constructed on any basic frame ℰ is called a basic parallelogram of Γ; we
shall denote its area by S(ℰ).

A vector of a lattice is any vector equal to a vector which has its beginning and its
end at points of the lattice, i.e. any vector with integer coordinates in an arbitrary
basic frame of the lattice.

pic. 3.01


We will also consider lattices of rank 1, i.e. the set of points {ke : k € Z}, where e is 
an arbitrary vector with the fixed beginning. We will sometimes call a point a lattice of 
rank 0. 

Any subset Γ_1 of a lattice Γ which is a lattice itself is called a sublattice of Γ.


Theorem 3.1 n.n. For any lattice Γ_ℰ the following three statements hold:


(1) Every frame ℰ' = (O', e_1, e_2), where O' ∈ Γ, is basic for Γ (pic. 3.02).

(2) The lattice is symmetric with respect to each of its points.

(3) The lattice is symmetric with respect to the midpoint of every segment which joins two
arbitrary points of it (pic. 3.03).


> Let’s prove (3). Let’s consider any two points P and Q of the lattice with coordinates
(p_1, p_2) and (q_1, q_2) in the frame ℰ. Then the coordinates of the midpoint R of PQ are equal
to ((p_1 + q_1)/2, (p_2 + q_2)/2). Let’s now take an arbitrary point X ∈ Γ with coordinates (x_1, x_2); then,
as one can easily see, the point symmetric to it with respect to R has the coordinates
(p_1 + q_1 − x_1, p_2 + q_2 − x_2), i.e. integer coordinates in the frame ℰ. Statement (3) is
proved.

Hence, if P = Q, we obtain the proof of statement (2).

Statement (1) is obvious. ∎


Corollary of statement (1). Every lattice is a translationally regular arrangement of
points.


2°. Lemma 1 n.n. An integer matrix L is unimodular iff the matrix L⁻¹ inverse to L
is integer. ∎


The set of all integer unimodular (2 x 2)-matrices is usually denoted by GL(2, Z). 

Theorem 3.2 n.n. Let a lattice Γ be determined by a frame ℰ = ℰ(O, e_1, e_2). A frame
ℰ' = (O, e_1', e_2'), where e_1', e_2' are vectors of Γ, is a basic frame of the lattice Γ iff the
matrix of transformation of the frame ℰ into ℰ' belongs to GL(2, Z).


> Let the frame ℰ' be a basic frame of Γ (pic. 3.02). Let’s consider the integer matrix L
whose rows are the coordinate rows of the vectors e_1', e_2' in the frame ℰ. The rows of the
inverse matrix L⁻¹ are the coordinate rows of the vectors e_1, e_2 in the basic frame ℰ'.
That means that L⁻¹ is also integer. But then from lemma 1 it follows that L ∈ GL(2, Z).

Conversely, if the matrix of transformation of the frame ℰ into ℰ' is integer and unimodular, it is easy
to check that each point of Γ has integer coordinates in the frame ℰ', i.e. the frame ℰ'
is a basic one. ∎
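
The criterion of theorem 3.2 is easy to test numerically. Below is a minimal Python sketch, added for illustration only; the function names det2, inverse2 and is_basic_frame are assumptions made here. The matrix L holds, row by row, the integer coordinates of e_1', e_2' in a given basic frame.

```python
from fractions import Fraction

def det2(m):
    """Determinant of a 2x2 matrix given as a list of two rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse2(m):
    """Exact inverse of a 2x2 matrix with non-zero determinant."""
    d = Fraction(det2(m))
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def is_basic_frame(L):
    """True iff the integer matrix L is unimodular, i.e. det L = +1 or -1."""
    return det2(L) in (1, -1)
```

For example, the rows (5, 3) and (7, 4) give determinant −1, so they form a basic frame of Z², and the inverse matrix is integer, in agreement with lemma 1; the rows (2, 0) and (0, 1) have determinant 2 and generate only a sublattice.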

3°. Let the vectors e_1 and e_2 of an arbitrary frame ℰ = ℰ(O, e_1, e_2) have coordinates
(ξ_11, ξ_21) and (ξ_12, ξ_22) respectively in the coordinate system (O, ξ_1, ξ_2). See pic. 3.01.
Then two important matrices are associated with the frame ℰ: the coordinate matrix Ξ
and the Gramian matrix A. That is,

\[ \Xi = \begin{pmatrix} \xi_{11} & \xi_{12} \\ \xi_{21} & \xi_{22} \end{pmatrix} \]

and

\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
     = \begin{pmatrix} (e_1, e_1) & (e_1, e_2) \\ (e_1, e_2) & (e_2, e_2) \end{pmatrix}. \]

As one can easily see, the equality A = Ξ^T Ξ holds for the Gramian matrix A = ||a_ij|| of the
frame ℰ.


Let’s note that S_ℰ = |det Ξ| = √(det A).

pic. 3.03
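
As a small numerical illustration of the relation between the area and the Gramian matrix (a sketch added here, with hypothetical function names gram, det_gram and area_from_coords), one can compare the two expressions for a frame with integer coordinates:

```python
def gram(e1, e2):
    """Gramian matrix of the frame (e1, e2): pairwise scalar products."""
    dot = lambda u, v: u[0] * v[0] + u[1] * v[1]
    return [[dot(e1, e1), dot(e1, e2)],
            [dot(e1, e2), dot(e2, e2)]]

def det_gram(A):
    """Determinant of the Gramian matrix, equal to the squared area."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def area_from_coords(e1, e2):
    """Area of the parallelogram on e1, e2 as |det| of the coordinate matrix."""
    return abs(e1[0] * e2[1] - e1[1] * e2[0])
```

For e_1 = (2, 0), e_2 = (1, 3) the Gramian matrix has rows (4, 2) and (2, 10), its determinant is 36, and the area is 6.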


Corollary 1 n.n. of theorem 3.2. The areas of the basic parallelograms of any basic frames are
equal.


> In the notation used we have:
S_ℰ' = |det Ξ'| = |det(Ξ L^T)| = |det Ξ| |det L| = |det Ξ| = S_ℰ. ∎


4°. Let ℰ = ℰ(O, e_1, e_2) be a frame. The frame ℰ^(−1) = ℰ(O, e_1^(−1), e_2^(−1)) whose
vectors satisfy the relations (e_i, e_i^(−1)) = 1 and (e_i, e_j^(−1)) = 0 for i ≠ j is called the
reciprocal frame to ℰ.

For the frame ℰ not only the reciprocal frame but also the adjoint frame ℰ* = ℰ(O, e_1*,
e_2*) is considered, with e_i* = (det A)^{1/2} e_i^(−1), where A is the Gramian matrix of ℰ.
From the equality

det A* = (det A)² det A⁻¹ = det A

it follows that

S_ℰ* = √(det A*) = √(det A) = S_ℰ.

Let’s note one aspect of the difference between reciprocal and adjoint frames.

1) The lengths of the vectors e_1^(−1), e_2^(−1) of the reciprocal frame ℰ^(−1) depend on the standard
of length (n.n).

The following example makes it clear. Let the length of e_1 be equal to 2 cm, and let the
angle between e_1 and e_1^(−1) be equal to π/3; then the length of e_1^(−1) is equal to 1 cm. But if
we measure the lengths of vectors in millimeters from the start, the length of e_1 will not vary, and the
length of e_1^(−1) will be equal to 1/10 mm, which is obviously not equal to 1 cm.

2) The lengths of the vectors e_1*, e_2* of the adjoint frame ℰ* do not depend on the standard of
length (n.n). See pic. 3.04.
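
The dependence on the unit of length can be checked on a concrete frame. The sketch below is an illustration added here, with hypothetical function names; it assumes the frame is given by rational coordinates in an orthonormal system, so that √(det A) = |det Ξ| can be computed exactly. It builds the reciprocal and adjoint frames and shows how they react to a change of units.

```python
from fractions import Fraction as F

def cross(u, v):
    """Determinant of the 2x2 coordinate matrix with columns u, v."""
    return u[0] * v[1] - u[1] * v[0]

def reciprocal_frame(e1, e2):
    """Vectors r1, r2 with (e_i, r_i) = 1 and (e_i, r_j) = 0 for i != j."""
    d = F(cross(e1, e2))
    r1 = (e2[1] / d, -e2[0] / d)
    r2 = (-e1[1] / d, e1[0] / d)
    return r1, r2

def adjoint_frame(e1, e2):
    """e_i* = sqrt(det A) * r_i, where sqrt(det A) = |det Xi| is the frame area."""
    s = abs(F(cross(e1, e2)))
    r1, r2 = reciprocal_frame(e1, e2)
    return tuple(s * c for c in r1), tuple(s * c for c in r2)
```

For e_1 = (2, 0), e_2 = (1, 3) the adjoint frame is (3, −1), (0, 2), with the same area 6. Multiplying all coordinates by 10 (centimeters to millimeters) multiplies the adjoint vectors by 10, so they describe the same geometric vectors, whereas the coordinates of the reciprocal vectors are divided by 10.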


pic. 3.04


The lattices determined by the frames ℰ^(−1) and ℰ* are called reciprocal and
adjoint to Γ_ℰ respectively, and we shall denote them by Γ^(−1) and Γ*.


Theorem 3.3 n.n. A lattice Γ^(−1) is reciprocal to a given lattice Γ iff all scalar products
(g, g') are integers, where g and g' are vectors of Γ and Γ^(−1) respectively.

This theorem gives the arithmetical interpretation of a reciprocal lattice. As we won’t
need it further, we leave its proof to the reader.


§3.2 Frames, Lattices and Positive Definite Quadratic Forms (PQF) 


1°. Here, as well as everywhere further, we keep all designations of item 3° of §1.

Let x_1, x_2 be the coordinates of any point X with respect to a frame ℰ. Then the square
of the distance between the origin and X is equal to

|OX|² = (x_1e_1 + x_2e_2, x_1e_1 + x_2e_2)
      = a_11x_1² + 2a_12x_1x_2 + a_22x_2² = f_ℰ(x_1, x_2) = f_ℰ.

As the square of the distance between distinct points is positive, the quadratic form f_ℰ is
a positive definite quadratic form (PQF). The form f_ℰ is called the metric form of ℰ. Let’s
note that the matrix of f_ℰ is the Gramian matrix of ℰ.


Theorem 3.4 n.n. Every PQF of two variables determines (up to a motion) a frame of
the plane E².


> Let’s consider a PQF f = f(x_1, x_2). It is known from linear algebra, and for n = 2
it is proved trivially, that such a form f can be decomposed in infinitely many ways
into a sum of two squares of independent linear forms in the variables x_1, x_2. Let us have one
such decomposition:

f = (ξ_11x_1 + ξ_12x_2)² + (ξ_21x_1 + ξ_22x_2)².

Then the form f can be presented in the following manner:

\[ f = \begin{pmatrix} \xi_{11}x_1 + \xi_{12}x_2 & \xi_{21}x_1 + \xi_{22}x_2 \end{pmatrix}
       \begin{pmatrix} \xi_{11}x_1 + \xi_{12}x_2 \\ \xi_{21}x_1 + \xi_{22}x_2 \end{pmatrix}
     = (x_1e_1 + x_2e_2,\; x_1e_1 + x_2e_2), \]

where e_1 and e_2 are linearly independent vectors with coordinates (ξ_11, ξ_21) and (ξ_12, ξ_22),
i.e. after a choice of the point O, the frame ℰ = ℰ(O, e_1, e_2) is determined by the vectors
e_1, e_2.

The metric of ℰ does not depend on the choice of the decomposition of the form into a sum of squares
or on the choice of O. Indeed, the lengths of the frame’s vectors and their scalar product are uniquely
determined by the coefficients of the form f. That any two frames with identical
metric can be made to coincide by a motion is obvious. ∎
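
One particular decomposition into a sum of squares is obtained by completing the square, which places e_1 along the first coordinate axis. The following Python sketch is an illustration added here (the function name frame_from_pqf is hypothetical, and floating-point arithmetic is assumed); it builds such a frame from the coefficients a_11, a_12, a_22 of a PQF.

```python
import math

def frame_from_pqf(a11, a12, a22):
    """Return one frame (e1, e2) whose metric form is
    a11*x1^2 + 2*a12*x1*x2 + a22*x2^2; e1 is placed on the first axis."""
    if a11 <= 0 or a11 * a22 - a12 * a12 <= 0:
        raise ValueError("the form is not positive definite")
    e1 = (math.sqrt(a11), 0.0)
    # e2 has projection a12/|e1| on e1; its height follows from |e2|^2 = a22.
    e2 = (a12 / math.sqrt(a11), math.sqrt(a22 - a12 * a12 / a11))
    return e1, e2
```

For the form 4x_1² + 4x_1x_2 + 10x_2² (a_11 = 4, a_12 = 2, a_22 = 10) this gives e_1 = (2, 0), e_2 = (1, 3). Any rotation or reflection of this frame corresponds to another decomposition of the same form.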

Let’s note that theorem 3.4 makes clear the geometric sense of the various decompositions
of a PQF into a sum of squares.

Let’s also note that theorem 3.4 establishes a one-to-one correspondence between PQFs
and frames given up to a motion.


Definition: The PQF corresponding to the reciprocal frame ℰ^(−1) is called reciprocal to the PQF
f_ℰ and is denoted f^(−1); i.e. it is the quadratic form determined by the matrix A⁻¹ inverse
to the matrix A of the PQF f which corresponds to the frame ℰ. The PQF f* corresponding
to ℰ* (the adjoint frame to ℰ) is called adjoint to f_ℰ.


2°. It is obvious that it is possible to determine a lattice both with the help of its basic
frame ℰ and (up to a motion) by the PQF f_ℰ. Thus, from theorem 3.4 one sees that there exists
a one-to-one correspondence between PQFs and congruence classes of lattices. Besides
this, to the forms f^(−1) and f*, i.e. reciprocal and adjoint to f, there correspond the congruence
classes of the lattices Γ^(−1) and Γ* reciprocal and adjoint to Γ_f.

Let A and A' be the Gramian matrices of arbitrary basic frames ℰ and ℰ' of a lattice
Γ. Let f(X) and f'(X'), where X = (x_1, x_2)^T, X' = (x_1', x_2')^T, be the metric forms of the
frames ℰ and ℰ'. Let, finally, as in theorem 3.2, L be the matrix of the transformation of the
frame ℰ into ℰ'. Then, obviously, we have X' = (L^T)⁻¹X = L*X and

A' = L A L^T. (3.1)

Let’s consider the PQFs f and f' determined by the matrices A and A'.


Definition: The PQFs f and f' (the matrices A and A') are called integrally unimodularly equiv-
alent (f ~ f', A ~ A') if there exists a matrix L ∈ GL(2, Z) for which (3.1) holds.


Sometimes we will denote the form f' by Lf. Further we shall omit the words
“integrally unimodularly”.

As we have seen here (see also theorems 3.2 and 3.4), every lattice corresponds to
an equivalence class of PQFs; hence there exists a one-to-one correspondence between equivalence
classes of PQFs and congruence classes of lattices.
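
Relation (3.1) and the substitution X = L^T X' can be checked on a small example. The sketch below is an illustration added here, with hypothetical helper names, using plain integer arithmetic; it transforms a Gramian matrix by a unimodular matrix L and allows one to compare the values of the two forms.

```python
def mat_mul(P, Q):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(P):
    return [[P[0][0], P[1][0]], [P[0][1], P[1][1]]]

def transform_gram(A, L):
    """Gramian matrix A' = L A L^T of the frame obtained from A's frame by L."""
    return mat_mul(mat_mul(L, A), transpose(L))

def form_value(A, x):
    """Value of the quadratic form with symmetric matrix A at x = (x1, x2)."""
    return A[0][0] * x[0] ** 2 + 2 * A[0][1] * x[0] * x[1] + A[1][1] * x[1] ** 2

def old_coordinates(L, x_new):
    """X = L^T X': coordinates in the old frame of the point with new coordinates X'."""
    return (L[0][0] * x_new[0] + L[1][0] * x_new[1],
            L[0][1] * x_new[0] + L[1][1] * x_new[1])
```

With A having rows (4, 2), (2, 10) and L having rows (5, 3), (7, 4), the matrix A' has rows (250, 342), (342, 468); its determinant is again 36, and f'(X') = f(L^T X') for every integer X', so both forms take the same set of values on the lattice.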

There immediately arises the problem of choosing one definite basic frame
in a lattice. This problem is solved in the so-called theory of reduction, to which the following
section is devoted.


§3.3 Theories of Reduction. Minimal Vector. Lagrange Reduction 
1°. A quadratic form f (a frame ℰ_f) chosen on the basis of special requirements
(“conditions of reduction”) from an equivalence class of forms (frames) (generally speak-
ing, not necessarily positive) is called a reduced form (frame). An algorithm for such a
choice is called an algorithm of reduction. As an equivalence class {f} is usually deter-
mined by some representative of it, a form f_0 (a frame ℰ_{f_0}), an algorithm of reduction consists
in finding, starting from any form f_0, a reduced form f ~ f_0. Accord-
ing to one or another condition of reduction, various particular theories of
reduction are known.

Though uniqueness of the reduced form (frame) for a given equivalence
class is desirable, the methods of reduction in use admit a finite number of reduced forms (frames)
for some equivalence classes. In such cases a
whole set of reduced forms (frames) is associated with every equivalence class.

The first theory of reduction for PQFs of two variables was constructed in 1768 by
Lagrange; we give its geometric variant in 3°.

Gauss also investigated the problem of reduction; however, the problem of reduction for
PQFs of three variables was solved only by Seeber in 1831. It was in his review of this
paper of Seeber that Gauss introduced the concept of a lattice for the first time.

Historically the first developed algorithm of reduction suitable for PQFs of any number
n of variables was offered by A.N. Korkin and E.I. Zolotarev.

The problem of reduction is one of the primary problems in the theory of quadratic forms,
and at present, besides the method of A.N. Korkin and E.I. Zolotarev, several other
methods of reduction are available (n-dimensional, though worked out in detail only for
the first few values of n): the Hermite-Minkowski, Selling-Charve and Voronoi reductions.
The method of Delone reduction is popular in crystallography. A rather general method
of reduction, covering as special cases some of those mentioned above, was offered by
Venkov.

2°. Further, when speaking about vectors of a lattice, we shall consider, that their 
beginning is the origin O (the origin of a lattice). 


Lemma 2 n.n. In every lattice Γ for any R > 0 there exists only a finite number of vectors
with lengths not exceeding R.


> Let’s take a disk D of radius R with center at O and the affine transformation of Γ onto
the lattice Z² (pic. 3.05). The image of the disk D under this transformation is the set bounded
by an ellipse, which obviously contains only a finite number of integer points. ∎


pic. 3.05


Definition: A minimal vector of a lattice is its vector which has the minimal positive 
length. 


Lemma 3 In every lattice Γ ⊂ E² there exist minimal vectors (n.n), and their number is
not more than 6. This number of minimal vectors is attained only for the lattice constructed on
the regular triangle (n.n).

> The existence of minimal vectors obviously follows from lemma 2. Let’s consider a
lattice Γ ⊂ E² with the origin at O and also the circle C with center at O and radius m equal to
the length of a minimal vector. All vectors with beginnings at O
and ends at the points of the intersection C ∩ Γ are minimal and their length is equal
to m. The distance between the ends of any two such vectors is not smaller than m,
since their difference is a vector of the lattice. That means that the number of points in the
intersection C ∩ Γ cannot be more than 6.

The last statement is now obvious. ∎
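
Lemmas 2 and 3 suggest a direct (if naive) way of finding all minimal vectors: enumerate the finitely many integer coordinate pairs inside a bounding box. The Python sketch below is an illustration added here, not a method from the text; the names form_value and minimal_vectors are hypothetical, and integer Gramian entries a_11, a_12, a_22 of a basic frame are assumed. The box comes from the inequalities x_1² ≤ a_22 f/det A and x_2² ≤ a_11 f/det A, where f is the value of the metric form.

```python
from math import isqrt

def form_value(a11, a12, a22, x1, x2):
    """Squared length of the lattice vector x1*e1 + x2*e2."""
    return a11 * x1 * x1 + 2 * a12 * x1 * x2 + a22 * x2 * x2

def minimal_vectors(a11, a12, a22):
    """Return (m2, vectors): the minimal squared length and all integer
    coordinate pairs (x1, x2) != (0, 0) attaining it."""
    det = a11 * a22 - a12 * a12
    if a11 <= 0 or det <= 0:
        raise ValueError("the form is not positive definite")
    bound = min(a11, a22)              # squared length of e1 or e2
    x1_max = isqrt(a22 * bound // det)  # |x1| cannot exceed this
    x2_max = isqrt(a11 * bound // det)  # |x2| cannot exceed this
    best, vectors = bound, []
    for x1 in range(-x1_max, x1_max + 1):
        for x2 in range(-x2_max, x2_max + 1):
            if (x1, x2) == (0, 0):
                continue
            v = form_value(a11, a12, a22, x1, x2)
            if v < best:
                best, vectors = v, [(x1, x2)]
            elif v == best:
                vectors.append((x1, x2))
    return best, vectors
```

For the lattice built on the regular triangle (for instance a_11 = a_22 = 2, a_12 = 1) the function finds six minimal vectors, and for the square lattice (a_11 = a_22 = 1, a_12 = 0) it finds four.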


3°. Definition: A basic frame ℰ = ℰ(O, e_1, e_2) of an arbitrary lattice Γ is called
Lagrange reduced if the length of the orthogonal projection of each vector e_i onto e_j does not exceed
|e_j|/2, where i, j = 1, 2; i ≠ j, and the angle between e_1 and e_2 is not obtuse.


We shall call the set Γ^k = {ke_2 + le_1}, where k ∈ Z is fixed and l runs over Z, the k-series
of Γ with respect to a given frame ℰ.


Lagrange algorithm: We shall consider a lattice Γ (pic. 3.06) with a basic frame
ℰ = ℰ(O, e_1, e_2). Let |e_1| ≤ |e_2|; otherwise we change their indexing. Let’s choose
a vector in the 1-series Γ^1 such that its orthogonal projection onto e_1 has length not more
than |e_1|/2. If this projection is directed as the vector e_1, we denote the chosen vector
by e_2', and if it is not directed as e_1, we denote by e_2' the vector opposite to the chosen
one.


Two cases are possible:

1) |e_1| > |e_2'| and 2) |e_1| ≤ |e_2'|.

In case 1) we return to the beginning of the algorithm, taking the vector e_2' as the new vector
e_1 and the vector e_1 as the new vector e_2.


Lemma 4 n.n. In case 2) the vector e_1 is a minimal vector of Γ, and e_2' is one of (not more
than 4) the shortest vectors not collinear to e_1. The length of the orthogonal projection of e_1
onto e_2' is not more than |e_2'|/2.


> Let’s denote by h the distance between the 0- and 1-series, i.e. between the straight lines
carrying the series Γ^0 and Γ^1. We shall estimate the magnitude 2h. We have:

\[ |e_2'|^2 \le h^2 + \frac{|e_1|^2}{4} \le h^2 + \frac{|e_2'|^2}{4},
   \qquad\text{hence}\qquad 2h \ge \sqrt{3}\,|e_2'| > |e_2'|. \]

From here we have that there are no vectors shorter than e_2' in the k-series with |k| > 1.
And in the k-series with |k| = 1 it is obvious that there cannot be more than 4 vectors which
have length equal to |e_2'|. The last statement of the lemma is now obvious. ∎

Thus, in case 2), the frame ℰ(O, e_1, e_2') is a Lagrange reduced frame.


Theorem 3.5 The algorithm of Lagrange is finite. 


> At each subsequent step of the algorithm we take a vector e_1 shorter than the one taken at
the previous step. But by lemma 2 there is only a finite number of lattice vectors with lengths smaller
than |e_1|. ∎
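
The steps of the Lagrange algorithm can be sketched in Python as follows. This is an illustrative implementation written for this exposition, under the assumption that the basic frame is given by integer coordinates in an orthonormal system, so that all comparisons are exact; the names dot and lagrange_reduce are hypothetical.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def lagrange_reduce(e1, e2):
    """Return a Lagrange reduced basic frame of the lattice spanned by the
    linearly independent integer vectors e1, e2."""
    while True:
        if dot(e1, e1) > dot(e2, e2):      # ensure |e1| <= |e2|
            e1, e2 = e2, e1
        a11, a12 = dot(e1, e1), dot(e1, e2)
        # Nearest integer to a12/a11; subtracting that multiple of e1 makes
        # the projection of the new vector on e1 at most |e1|/2 in length.
        n = (2 * a12 + a11) // (2 * a11)
        e2p = (e2[0] - n * e1[0], e2[1] - n * e1[1])
        if dot(e1, e2p) < 0:               # make the angle non-obtuse
            e2p = (-e2p[0], -e2p[1])
        if dot(e1, e1) > dot(e2p, e2p):    # case 1): repeat with a shorter e1
            e1, e2 = e2p, e1
        else:                              # case 2): the frame is reduced
            return e1, e2p
```

For example, starting from the basic frame (5, 3), (7, 4) of Z², the successive vectors e_2' are (2, 1), (1, 0) and (0, 1), and the algorithm stops with the frame (1, 0), (0, 1).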

pic. 3.06

Let’s note that there can be several Lagrange reduced frames with the beginning at the
origin in a lattice, but their number is finite and all of them are pairwise congruent. We
leave the proof of this and the analysis of the cases to the reader.

Let’s also note that the Lagrange algorithm is one of the most effective methods of searching for
minimal vectors of a lattice of rank 2 determined by an arbitrary basic frame.

In conclusion we give the definition of a Lagrange reduced PQF of two variables.


Definition: A PQF f = a_11x_1² + 2a_12x_1x_2 + a_22x_2² is called Lagrange reduced if 2a_12 ≤
a_11, a_11 ≤ a_22, a_12 ≥ 0.


§3.4 “Semiopen” parallelogram. Lemmas of Blichfeldt and Minkowski 


1°. Definition: Let us be given a lattice Γ ⊂ E², a point A ∈ Γ and a pair 𝒜 = (a_1, a_2) of
linearly independent vectors of this lattice with the same beginning at A. The parallelogram
Π(A, 𝒜) determined by the point A and the vectors a_1, a_2 is called a parallelogram of Γ.


Rather often, including this section, it will be convenient to consider the parallelogram
Π(A, 𝒜) as a “semiopen” parallelogram, i.e. as the following set of points: {αa_1 + βa_2 | 0 ≤
α, β < 1}.

In particular, a “semiopen” basic parallelogram Π(O, ℰ) of Γ consists of those and only
those points of the plane E² which have coordinates (x_1, x_2) in the frame ℰ satisfying the
inequalities 0 ≤ x_i < 1, with i = 1, 2.
From the translational regularity of a lattice we infer that it is possible to divide the whole plane disjunctively into semiopen parallelograms, i.e. the following theorem holds:


Theorem 3.6 n.n. Let us be given a lattice $\Gamma$ determined by a basic frame $\mathcal{E} = \mathcal{E}(e_1, e_2)$. Then
$$E^2 = \bigsqcup_{A \in \Gamma} \Pi(A, \mathcal{E}).$$
It is the disjunctive tiling by semiopen basic parallelograms $\Pi(A, \mathcal{E})$ (i.e. every point of the plane belongs to one and only one such basic parallelogram).


▷ Let the point $X \in E^2$ have coordinates $(x_1, x_2)$ in $\mathcal{E}$. Let also $A = ([x_1], [x_2])$, where $[\,\cdot\,]$ denotes the integer part (entier). Then $X$ obviously belongs to $\Pi(A, \mathcal{E})$ and does not belong to any other parallelogram $\Pi(B, \mathcal{E})$ with $B \in \Gamma \setminus \{A\}$. ◁
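
The proof above is constructive, and a minimal sketch of it may be helpful. The code assumes a basic frame and a point with integer coordinates in a rectangular coordinate system (so that exact fractions can be used); the function name `locate` is hypothetical. It computes the coordinates of the point in the frame and takes their integer parts, which gives the beginning of the unique semiopen basic parallelogram containing the point.

```python
from fractions import Fraction
from math import floor


def locate(point, e1, e2):
    """Return (A, inner): the integer coordinates of the beginning A of the
    tile containing the point, and the coordinates of the point inside it."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if det == 0:
        raise ValueError("e1 and e2 must be linearly independent")
    # Coordinates (x1, x2) of the point in the frame (e1, e2), by Cramer's rule.
    x1 = Fraction(point[0] * e2[1] - point[1] * e2[0], det)
    x2 = Fraction(e1[0] * point[1] - e1[1] * point[0], det)
    k1, k2 = floor(x1), floor(x2)
    return (k1, k2), (x1 - k1, x2 - k2)


A, inner = locate((7, -2), (2, 1), (1, 3))
print(A, inner)
```

Both components of `inner` always lie in the half-open interval from 0 to 1, which is exactly the semiopenness condition used in the theorem.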

We shall denote the tiling obtained in theorem 3.6 by $\mathfrak{T}_{\mathcal{E}}$.

Let us note that the plane can also be represented as a tiling $\mathfrak{T}_{\mathcal{A}}$ by semiopen parallelograms of $\Gamma$ constructed on an arbitrary pair $\mathcal{A} = (a_1, a_2)$ of its linearly independent vectors. (For the proof it is sufficient to take the lattice $\Gamma_{\mathcal{A}}$ generated by this pair and its tiling $\mathfrak{T}_{\mathcal{A}}$.)


Lemma 5 n.n. A semiopen parallelogram $\Pi(O, \mathcal{A})$ of a lattice $\Gamma$ is a basic parallelogram iff it does not contain any points of the lattice except $O$.


▷ The necessity of the condition is obvious by virtue of theorem 3.6; let us prove its sufficiency. Let us be given a semiopen parallelogram $\Pi = \Pi(O, \mathcal{A})$, with $\mathcal{A} = (a_1, a_2)$, in which there are no points of the lattice except $O$. Then we take the tiling $\mathfrak{T}_{\mathcal{A}}$ by parallelograms congruent to $\Pi$. In the parallelogram $\Pi$ there are no points of $\Gamma$ except one point, and hence in any other parallelogram of this tiling there are no points of $\Gamma$ except its beginning (here we have used the translational regularity of the lattices $\Gamma$ and $\Gamma_{\mathcal{A}}$). In other words, the whole lattice $\Gamma$ consists only of points of $\Gamma_{\mathcal{A}}$. Hence $\Pi$ is a basic parallelogram of $\Gamma$. ◁

2°. Here, for completeness of the account, we adduce the famous lemmas of Blichfeldt and Minkowski. They are given without proofs, which can be found in many references.

Further, we shall assume that on $E^2$ there is given a lattice $\Gamma^2$ with the area $S$ of a basic parallelogram.


Theorem 3.7 n.n. (Blichfeldt’s lemma). Let $M$ be a set with the area $S(M) > S$ (pic. 3.07). Then there exists a parallel translation $\lambda$ such that $\#(\lambda(M) \cap \Gamma^2) \ge 2$ (i.e. in the set $M$ there can be found at least one pair of points which determine a nonzero vector of $\Gamma^2$).


Blichfeldt is known, perhaps, mostly for this beautiful lemma. His main achievements belong to the theory of packing. Here a really great result belongs to him: he found the densest lattice packings of equal balls for $n = 6, 7, 8$. Blichfeldt published the complete theorem in 1935, and since then nothing is known for $n > 8$.


Theorem 3.8 n.n. (Lemma of Minkowski on a convex body). Let $M \subset E^2$ be a centrally symmetric convex figure with the centre at one of the points of $\Gamma^2$ (pic. 3.08). Then, if $S(M) > 4S$, $M$ contains at least one more point of the lattice.
pic. 3.07


§3.5 Characteristic Criteria of Lattices among Point Systems 


1°. From the definition of a lattice it follows that a lattice is a (1, 1)-system. The following theorem gives the condition under which an arbitrary (1, 1)-system becomes a lattice.


Theorem 3.9 n.n. A translationally regular (1, 1)-system is a lattice.


▷ We shall first prove the theorem in the case of a straight line. According to the R-property, on any segment of length $2R_0$ there can be found two points of the system, in whose neighbourhoods of radius $r_0$ there are no other points of the system, according to the r-property.

Suppose there are no other points of the system between the chosen ones. Then the system is a lattice with a basic vector drawn from the first point to the second. Suppose now that there are other points of the system between them. Then, according to the r-property, there is a finite number of them and the distance between any two of them does not exceed $2r_0$. Let us take two nearest points, and everything is reduced to the previous case.

Let us now consider a Delone system on the plane. Take any two points of the system and draw the straight line $l$ through them. The intersection of the system with $l$ is a lattice of rank 1. Let us choose the origin at one of the points of this lattice. From the R-property it follows that there exists a point $A$ of the system not lying on $l$ (pic. 3.09). Suppose the parallelogram constructed on the vector from the origin to $A$ and on the basic vector of the one-dimensional lattice does not contain other points of the system. Then the system is obviously a lattice. Suppose now that our parallelogram contains other points of the system. From the r-property we infer that there is a finite number of them. Then, instead of $A$, we choose among them the point nearest to the straight line $l$; we obtain a new parallelogram and come to the first case. ◁

pic. 3.08

Each of the two following theorems is an immediate corollary of theorem 3.9. 


Theorem 3.10 n.n. Every translationally regular (1, 0)-system in $E^2$ is a lattice of rank $m$ with $m \le 2$.


Theorem 3.11 n.n. Every translationally regular (1, 1)-system (i.e. every translationally regular Delone system) in $E^2$ is a lattice of rank 2.


In conclusion of this section we prove a theorem which is a kind of converse to theorems 3.10 and 3.11 and which finally establishes the connection between (1, 1)-systems and lattices.


Theorem 3.12 n.n. Every lattice $\Gamma$ of rank $m$ with $m \le 2$ is a translationally regular (1, 0)-system. It is a Delone system iff $m = 2$.


▷ The corollary to theorem 3.1 actually affirms the existence of the number $r$ from the definition of a (1, 0)-system for a lattice. Translational regularity of a lattice is also affirmed by the corollary. Thus the first part is proved.

The last statement is obvious. ◁
4 Parallelohedra


§4.1 DV-tilings and L-tilings of Lattices


Lemma 1 2.2. No L-polygon of a lattice has an obtuse angle.


▷ Let $M$ be an L-polygon of a given lattice $\Gamma$ and let it have an obtuse angle between the edges $AB$ and $BC$, see pic. 4.01. The point $D$ symmetric to $B$ with respect to the middle of $AC$ belongs to the lattice (theorem 3.1) and lies inside the L-disk whose boundary is determined by the points $A$, $B$ and $C$. Hence $M$ is not an L-polygon. ◁


Lemma 2 No L-polygon of a lattice can have more than four vertices.


▷ Any convex polygon which has more than four vertices has at least one obtuse angle. ◁


Theorem 4.1 2.2. Every L-polygon of a lattice is either an acute triangle or a rectangle.
▷ ◁


Theorem 4.2 All L-tilings of lattices of rank 2 are divided into two types of affine equivalence:


1) General (pic. 4.02). In this type the star of every vertex of an L-tiling consists of 6 triangles, each contiguous to the next in cyclic order. Every two adjacent triangles are symmetric to each other with respect to the middle of their common edge.

2) Special (pic. 4.03). In this type the star of every vertex of an L-tiling consists of 4 rectangles, each contiguous to the next in cyclic order. Every two adjacent rectangles are symmetric to each other with respect to the middle of their common edge.


pic. 4.01


pic. 4.02


▷ Suppose that in the L-tiling of some lattice $\Gamma$ there can be found at least one triangle. From the translational regularity of a lattice and theorem 3.1 it follows that we can suppose that it is the triangle $OAB$, see pic. 4.04. From lemma 1 (and theorem 3.1) we infer that the triangles $OAB'$ and $OA'B$, symmetric to $OAB$ with respect to the middles of $OA$ and $OB$ respectively, are L-triangles. On calculating the sum of the angles we can see that the points $O$, $A'$ and $B'$ lie on the same straight line. Therefore the construction of the L-star of the point $O$ is finished by the triangles $OB'B''$, $OA''B''$, $OA'A''$ symmetric to the triangles $OA'B$, $OAB$, $OAB'$ respectively, with respect to $O$. From the translational regularity and theorem 3.1 it follows that the L-star of each point of $\Gamma$ is constructed. These stars obviously satisfy the description of the general type given in the theorem.

pic. 4.03

pic. 4.04

Suppose now that in the L-tiling of some lattice $\Gamma$ there are no triangles. Then all L-polygons in this tiling are rectangles. The fact that the tiling satisfies the description of the special type given in the theorem is easy to check, see pic. 4.05. ◁
pic. 4.05


Definition: A lattice is called general if its L-tiling has the general type, and it is called special if its L-tiling has the special type.


Theorem 4.3 2.2. A DV-domain of a lattice is either a rectangle (for special lattices) or a centrally symmetric inscribed hexagon (for general lattices).

Conversely, for every rectangle (every centrally symmetric inscribed hexagon) there exists a lattice for which this figure is a DV-domain.


▷ We shall use the duality of DV- and L-tilings. Let us recall that all $DV_O$-vertices, with $O \in \Gamma$, are the centres of the circles circumscribed around the polygons of the L-star of the point $O$.

In the general case we have a hexagon. It is centrally symmetric, as an L-star is centrally symmetric. It is inscribed, as all triangles of an L-star are pairwise equal. The special case is absolutely trivial.

Conversely, consider an arbitrary convex centrally symmetric inscribed hexagon $M$ with the vertices $U, V, W, X, Y, Z$ and with the centre $O$ (pic. 4.06). Let us denote by $A, B, C, D, E, F$ the points symmetric to $O$ with respect to $ZU$, $UV$, $VW$, $WX$, $XY$, $YZ$.


pic. 4.06 
Let us note, see pic. 4.06, that $\overrightarrow{OA} = \overrightarrow{OZ} + \overrightarrow{OU}$, $\overrightarrow{OB} = \overrightarrow{OU} + \overrightarrow{OV}$, $\overrightarrow{OC} = \overrightarrow{OV} + \overrightarrow{OW}$ and $\overrightarrow{OZ} + \overrightarrow{OW} = 0$, because $M$ is a centrally symmetric hexagon. Hence $\overrightarrow{OB} = \overrightarrow{OA} + \overrightarrow{OC}$. From here, and again from the central symmetry of $M$, it follows that the points $A, B, C, D, E, F$ have integer coordinates with respect to the frame $(\overrightarrow{OA}, \overrightarrow{OC})$, i.e. they belong to the lattice $\Gamma$ which has the frame $(\overrightarrow{OA}, \overrightarrow{OC})$ as a basic frame. We shall show that the triangles $AOB$, $BOC$, $COD$, $DOE$, $EOF$ and $FOA$ form the L-star of $\Gamma$. By translating this L-star by all vectors of $\Gamma$ we obtain a triangulation $\mathcal{T}$ of the plane, obviously compatible with the lattice $\Gamma$.
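
The vector identities used in this step are easy to check numerically. The sketch below is only an illustration: it assumes the hexagon is inscribed in the unit circle with centre at the origin and is given by the angles of three consecutive vertices; the helper names `reflect_origin` and `hexagon` are hypothetical.

```python
import math


def reflect_origin(p, q):
    """Reflect the origin O in the straight line through the points p and q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    # Foot of the perpendicular dropped from O on the line pq.
    s = -(p[0] * dx + p[1] * dy) / (dx * dx + dy * dy)
    foot = (p[0] + s * dx, p[1] + s * dy)
    return (2 * foot[0], 2 * foot[1])


def hexagon(u, v, w):
    """Vertices U, V, W, X, Y, Z on the unit circle, u < v < w < u + pi."""
    U, V, W = [(math.cos(a), math.sin(a)) for a in (u, v, w)]
    return [U, V, W] + [(-x, -y) for x, y in (U, V, W)]


U, V, W, X, Y, Z = hexagon(0.2, 1.1, 2.5)
A = reflect_origin(Z, U)
B = reflect_origin(U, V)
C = reflect_origin(V, W)
# Up to rounding, A coincides with Z + U and B with A + C.
print(A, (Z[0] + U[0], Z[1] + U[1]))
print(B, (A[0] + C[0], A[1] + C[1]))
```

The second printed pair shows, for this particular hexagon, that $B$ has coordinates $(1, 1)$ in the frame $(\overrightarrow{OA}, \overrightarrow{OC})$.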

Let us note that each convex centrally symmetric inscribed polygon, except a rectangle, has all angles obtuse. We leave the proof of this elementary fact to the reader.

Taking this note into account, we have that all angles of the triangles $AOB$, $BOC$, $COD$, $DOE$, $EOF$ and $FOA$, as supplementary to the angles of $M$, are acute. From this follows the fulfilment of the necessary and sufficient criterion of an L-tiling for the triangulation $\mathcal{T}$. Indeed, to each edge of an acute triangle of $\mathcal{T}$ there adjoins the triangle of $\mathcal{T}$ centrally symmetric to it. It is obvious that the third vertex of one of these triangles does not fall inside the disk circumscribed around the other.

Thus $\mathcal{T}$ is the L-tiling of $\Gamma$. According to the definitions, $M$ is the DV-polygon of $\Gamma$ with the centre at the point $O$.

The case of the special type is trivial. ◁


§4.2 Enumeration of all parallelohedra on the Plane 


Definition: A convex polygon $P$ is called a parallelohedron iff there exists a tiling $\mathfrak{P}$ in which every tile is parallel congruent to $P$.


If the tiling $\mathfrak{P}$ is face-to-face, then $P$ is called normal, otherwise abnormal. If the tiling $\mathfrak{P}$ is primitive, then $P$ is called primitive, otherwise non-primitive.


Theorem 4.4 2.2. Each parallelohedron which is abnormal is a parallelogram. Each parallelogram can appear as a normal parallelohedron and as an abnormal parallelohedron.


▷ As the tiling $\mathfrak{P}$ is not face-to-face, there can be found an edge $AB$ of a parallelohedron $P_0 \in \mathfrak{P}$ which contains a point $C$ that is a vertex of some parallelohedra $P_1, P_2 \in \mathfrak{P}$ adjoining the edge $AB$ along the edges $DC$ and $CE$ respectively (pic. 4.07a).

For convenience we shall assume that the segment $AB$ is horizontal, $P_0$ is located above $AB$, and $P_1$, $P_2$ are below $AB$, as shown in pic. 4.07a. As the parallelohedron $P_0$ adjoins the straight line $AB$ from above and $P_1$, $P_2$ from below, each parallelohedron of $\mathfrak{P}$ has “upper” and “lower” horizontal edges (it is not yet known that they have equal length). The parallelohedron $P_2$ is parallel congruent to $P_1$. Therefore the (upper) edge $CE$ of the parallelohedron $P_2$ is equal to the (upper) edge $DC$ of $P_1$. Analogously, the upper edge of any parallelohedron of our tiling is equal to $DC = CE$.

The edge $CF$ of the parallelohedron $P_2$, adjoining the vertex $C$ and different from $CE$, coincides with the edge $CF^*$ of $P_1$, adjoining $C$ and different from $DC$. Firstly, they lie on the same straight line; otherwise we could choose a point $X$, not belonging to the parallelohedra $P_1$ and $P_2$, lying below $AB$ and sufficiently close to the point $C$, such that any parallelohedron of $\mathfrak{P}$ covering it would have common interior points with the polygons $P_1$ and (or) $P_2$, see pic. 4.07b. By the same reasoning it follows that $F = F^*$. From this it follows that the edges of the parallelohedra $P_1$ and $P_2$ adjoining the vertices $D$, $C$ and $E$ are parallel and equal to each other.


pic. 4.07a


pic. 4.07b


pic. 4.07c


Further, it is obvious that the lower edges of $P_1$ and $P_2$ lie on the same straight line. From here and from the reasoning used twice above (pic. 4.07c) we infer that the point $F$ must lie on the mentioned straight line.

The second statement of the theorem is obvious. ◁
Theorem 4.5 2.2. Each normal parallelohedron which is not a parallelogram is a centrally symmetric hexagon.
Each centrally symmetric hexagon is a normal parallelohedron.


▷ Let us consider an arbitrary normal parallelohedron $P_0$ and the tiling $\mathfrak{P}$ formed by it. First we note that for each edge of a normal parallelohedron there obviously exists an edge equal and parallel to it. Let $AB$ and $A'B'$ be such edges of $P_0$, pic. 4.08. From the convexity of the parallelohedron we infer that the edges $BC$ and $B'D'$ of $P_0$ should lie outside of the parallelogram $ABB'A'$.


pic. 4.08


The parallelohedron $P_1$ adjoins $P_0$ along the edge $AB$. Thus there exists one more edge $BD$ which has $B$ as its end. The corresponding edges $B'D'$ and $B'C'$ also exist for the point $B'$. There can be found a pair of edges of $P_0$ respectively equal and parallel to the edges $BC$ and $B'D'$. These edges, by the convexity of $P_0$, cannot belong to the sequence of edges of $P_0$ which connects the points $D'$ and $C$ and does not contain $AB$. Let us show that they should have a common vertex $B''$. Indeed, the parallelohedron $P_3$ contiguous to $P_0$ along the edge $BC$ should have a pair of edges equal and parallel to $BD$. These two edges should be placed inside the angle $DBC$, but by the convexity of $P_3$ they cannot both be placed strictly inside $DBC$. Hence one of them should coincide with $BD$. Thus there exists the point $B'' \in P_0$ corresponding to the point $B \in P_3$.

Secondly, the edges $BC$ and $BD$ ($B'C'$ and $B'D'$) are edges of the same parallelohedron $P_3$ ($P_4$), which are adjacent at the point $B$ ($B'$).

Let us consider the parallel translation $\lambda$ such that $\lambda(P_1) = P_0$ and $\lambda(P_0) = P_2$. It is obvious that $\lambda(P_3) = P_4$. Thus the parallelohedra $P_3$ and $P_4$ have the common edge $A''B''$ equal and parallel to the edge $AB$.

The triangle $A''D'C$ is situated strictly in the band between the straight lines $AB$ and $A'B'$; that is why an entire parallelohedron of $\mathfrak{P}$ cannot be contained in it. Hence each point of $A''D'C$ belongs to at least one of the parallelohedra $P_0$, $P_3$ and $P_4$. But the latter, unless all three points $A''$, $D'$ and $C$ coincide, contradicts the convexity of the polygons $P_0$, $P_3$ and $P_4$.

As the points $D'$ and $C$ coincide, the points $A$ and $D''$ also coincide, as do $C''$ and $A'$. So the parallelohedron $P_0$ is a hexagon.

The second statement of the theorem is obvious. ◁


§4.3 The Main Theorems about parallelohedra


Now that the enumeration of all parallelohedra on the plane is completed, many theorems which are true (but difficult or very difficult to prove) for parallelohedra of any dimension become obvious or almost obvious in the two-dimensional case.


Theorem of Minkowski n.2. Every parallelohedron is centrally symmetric.
▷ See theorems 4.4 and 4.5. ◁


Theorem of Venkov n.2. Every parallelohedron which is abnormal can appear as a normal parallelohedron.


▷ See theorem 4.4. ◁


Theorem of Voronoi n.2. Every primitive parallelohedron is affinely equivalent to the DV-domain of some lattice.


▷ According to theorems 4.4 and 4.5, each primitive two-dimensional parallelohedron is a centrally symmetric (convex) hexagon. We need to prove that every hexagon of this kind is affinely equivalent to an inscribed one.

First we shall prove, by a method convenient for us, the rather well-known fact that it is possible to circumscribe an ellipse around each hexagon of this kind. Let us take four of the six vertices of our hexagon (pic. 4.09) which form the parallelogram $ABDE$, and transform this parallelogram into a rectangle whose edges will be called “vertical” and “horizontal”. Our rectangle determines a family of circumscribed ellipses whose limiting figures are the horizontal and vertical bands determined by the edges of the rectangle. The images of the two remaining vertices of the hexagon lie, by its convexity, in one of these bands and select one ellipse from the constructed family. Now the auxiliary statement is obvious.

On transforming this ellipse into a circle, we prove the theorem. ◁
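
The statement just proved can also be checked by direct computation. The sketch below does not follow the geometric argument of the proof; as a swapped-in technique it solves a linear system for the ellipse $ax^2 + bxy + cy^2 = 1$ centred at the centre of symmetry and passing through the vertices, and then factors the quadratic form to obtain a linear map carrying this ellipse to the unit circle. It assumes that the centre of the hexagon is the origin and that three consecutive vertices are given; the names `det3` and `inscribing_map` are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def inscribing_map(p1, p2, p3):
    """Linear map sending the hexagon p1, p2, p3, -p1, -p2, -p3 to a hexagon
    inscribed in the unit circle."""
    rows = [[x * x, x * y, y * y] for x, y in (p1, p2, p3)]
    d = det3(rows)
    if d == 0:
        raise ValueError("degenerate hexagon")
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side (1, 1, 1).
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = 1
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    if not (a > 0 and 4 * a * c - b * b > 0):
        raise ValueError("the conic through the vertices is not an ellipse")
    # Factor a*x^2 + b*x*y + c*y^2 = (l11*x + l12*y)^2 + (l22*y)^2.
    l11 = math.sqrt(a)
    l12 = b / (2 * l11)
    l22 = math.sqrt(c - l12 * l12)
    return lambda p: (l11 * p[0] + l12 * p[1], l22 * p[1])


f = inscribing_map((2, 0), (1, 1), (-1, 1))
print([f(p) for p in ((2, 0), (1, 1), (-1, 1))])
```

The images of the three given vertices, and by central symmetry of the other three, lie on the unit circle, so the image hexagon is inscribed.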


Note: A similar theorem was also proved by Zhitomirskii for a large class of non-primitive parallelohedra. And though two-dimensional non-primitive parallelohedra do not belong to this class, it is obvious that each of them is affinely equivalent to the DV-domain of some lattice.

The following theorem is a kind of uniqueness theorem, with the theorem of Voronoi playing the role of the existence theorem.


Theorem MRS (Michel, Ryshkov, Senechal) n.2. Let a parallelohedron be affinely equivalent to the DV-domain $D$ of some lattice. Then, if this domain $D$ is not a direct product of two (or several) DV-domains of lattices of smaller rank, it is unique up to a similarity transformation.


pic. 4.09


▷ First we note that a rectangle is the direct product of two one-dimensional DV-domains (segments), and consequently the theorem is not formulated for it. The corresponding statement would not be correct, as there exists a continuum of hyperbolic rotations which transform a rectangle into rectangles with different ratios of edge lengths. So only the case of parallelohedra with six vertices is to be considered.

So, let us be given a centrally symmetric (convex) hexagon $M$ for which there exist two affine maps $A$ and $B$ onto inscribed hexagons. Carrying out a homothetic transformation if necessary, we can assume that these hexagons are inscribed in the same circle $C$. Let us carry out the transformation $AB^{-1}$. It transforms one of the inscribed hexagons into the other, and hence transforms the circle in which they are inscribed into itself. But the only kind of affine transformation of a circle into itself is a motion. ◁


5 Density 


§5.1 Ray Sets. Measuring Figures


1°. Definition. A set $M \subset E^2$ is called a ray set with respect to a point $O \in M$ (centre) if for any point $A \in M$ the interval $OA$ is contained in $M$ (pic. 5.01).


We shall suppose everywhere that, as soon as we are given a ray set, we are given its centre.

We shall call a set $M \subset E^2$ a strongly ray set with respect to $O \in M$ if there exists a neighbourhood (radiance neighbourhood) $U$ of $O$ such that $M$ is a ray set with respect to any point $O' \in U$.

pic. 5.01


Further, we shall consider ray sets of two kinds: bounded closed polygons, not necessarily convex (with centres which do not lie on the straight lines containing their edges), and bounded closed convex figures (with centres which do not lie on their boundary). No other ray sets are considered in this article.

It is obvious that all ray sets which we consider are strongly ray.

We shall denote the set homothetic to $M$ with respect to $O$ by $rM$, where $r > 0$ is the coefficient of the homothety.


Theorem 5.1 Let us be given a ray set $M$ with a centre $O$. For an arbitrary number $t > 0$ there exists $k_t > 0$ such that for all $k > k_t$ the value $\inf \rho(X, Y_t)$ does not depend on $k$, where $X \in \operatorname{fr}(kM)$ and $Y_t \in \operatorname{fr}((k + t)M)$.


▷ First we note that $\inf \rho(X, Y_t)$ is the infimum of a continuous function on a compact set. Let $\inf \rho(X, Y_t)$ be attained at points $A \in \operatorname{fr}(kM)$ and $B_t \in \operatorname{fr}((k + t)M)$.

We shall prove the theorem (1) for convex figures and (2) for ray polygons separately.

(1) Let $M$ be a (bounded closed) convex figure. We note, firstly, that the whole segment $AB_t$ (except $A$) obviously lies outside of the figure $kM$. Secondly, the supporting straight line $l'$ to $(k + t)M$ at the point $B_t$ is unique and orthogonal to $AB_t$; otherwise, see pic. 5.02, by moving the point $B_t$ we could obtain a smaller distance. Thirdly, we note that among the supporting straight lines to $kM$ at the point $A$ there is a straight line $l$ orthogonal to $AB_t$; otherwise, see pic. 5.03, $kM$ would intersect $l$ and we could find in the figure $kM$ a point closer to $B_t$ than $A$.

Let us now consider the point $A_1 \in (k + t)M$ corresponding to $A$ under the homothety. There exists a supporting straight line $l_1$ to $(k + t)M$ at it which is parallel to $l$. As the segment $AA_1$ lies outside of $kM$, the figures $kM$ and $(k + t)M$ are disposed outside of the strip between $l_1$ and $l'$. Therefore $l_1$ and $l'$ coincide. Thus $\rho(A, B_t) = \rho(l, l_1)$, and this distance obviously does not depend on $k$ and is linear in $t$.

(2) Here we shall distinguish “convex” and “concave” angles of a given polygon $M$, see pic. 5.04, 5.05. Let $EF$ and $E_1F_1$, and also $FG$ and $F_1G_1$, be pairs of edges of the polygons $kM$ and $(k + t)M$ which are homothetic to each other. The straight lines $EF$, $E_1F_1$, $FG$ and $F_1G_1$ we shall denote by $l$, $l_1$, $m$ and $m_1$ respectively. If $k$ increases, then the lengths of the mentioned four edges increase too (the points $E$ and $G$, and also $E_1$ and $G_1$, move away from the points $F$ and $F_1$ respectively). At the same time, the distances between $l$ and $l_1$ ($m$ and $m_1$) do not change. From the above it is clear that for each $t > 0$ there exists $k_t > 0$ such that for all $k > k_t$ the perpendiculars dropped from $F$ on $l_1$ and $m_1$ intersect the segments $E_1F_1$ and $F_1G_1$, if the angle $EFG$ is convex (pic. 5.04), or the perpendiculars dropped from $F_1$ on $l$ and $m$ intersect the segments $EF$ and $FG$, if the angle $EFG$ is concave (pic. 5.05). By increasing $k_t$, because the number of vertices of $M$ is finite, it is possible to achieve that the described situation is observed at all angles of $kM$. Further, we suppose that the number $k_t$ is already chosen sufficiently large.

Now it is clear that $\rho(A, B_t) = \rho(l', l'')$, where $l'$ and $l''$ are the straight lines which carry a pair of homothetic edges of the polygons $kM$ and $(k + t)M$, and this distance obviously does not depend on $k$ and is linear in $t$. ◁

Further, we shall denote $\inf \rho(A, B_t)$ (not depending on $k$) at a given $t$ by $d(t)$.

pic. 5.02

pic. 5.03

pic. 5.04


Theorem 5.2 $d(t)$ is a linear homogeneous function of $t$. That is, $d(t) = \lambda t$, where $\lambda > 0$ is some constant which depends on the ray set $M$ and on its centre.


▷ See the proof of theorem 5.1. ◁

Let $M$ be a ray set with the centre at a point $O$ and $M'$ be a set parallel congruent (parallel) to it with the centre at $O'$ corresponding to $O$. We shall denote the set homothetic to $M$ ($M'$) with respect to $O$ ($O'$) by $rM$ ($rM'$), where $r > 0$ is the coefficient of the homothety.


Theorem 5.3 Let $\rho > 0$ be the distance from $O$ to $O'$. Then for every $\rho > 0$ there exists $t > 0$, not depending on $k$, such that for sufficiently large $k > 0$

$$(k - t)M \subset kM' \subset (k + t)M.$$


▷ Let us take sufficiently large $k > 0$ and $\lambda$ from theorem 5.2, and put $t = 2\rho/\lambda$. Then from theorems 5.1 and 5.2 we infer that each point of the set $kM$ is at a distance of not less than $2\rho$ from each point of $E^2 \setminus (k + t)M$. At the same time, each point of $kM'$ is at a distance of not more than $\rho$ from the corresponding point of $kM$. From here it follows immediately that $kM' \subset (k + t)M$.

pic. 5.05

We can obtain the inclusion $(k - t)M \subset kM'$ either by similar reasoning or by replacing $M$ by $M'$, $M'$ by $M$ and $k$ by $k - t$ in the previous inclusion. ◁


§5.2 Density, Its Independence of Beginning and Boundary Effects


1°. The word “functional” in this section means a numerical function of sets $T \subset E^2$. The following functionals can serve as examples: the functional $F_1(T)$ identically equal to 1 on all $T \subset E^2$, the functional $F_d = d(T)$ equal to the diameter of $T$, and the functional $F_S = S(T)$ equal to the area of $T$.

The domain of definition of some functionals is not all sets $T \subset E^2$ (for example, no nonmeasurable set belongs to the domain of the functional $F_S$). However, in those places where we do not specify a functional, we shall assume that it is defined on the sets under consideration.

Further, all considered functionals are finite on bounded sets.

We draw attention to the fact that in all three examples the functional $F$ possesses the following property: $F(T) \le F(T')$ for $T \subset T'$. This property is called the monotonicity of $F$.


Definition: Let us be given some locally finite disposition $\mathfrak{T}$ of sets $\{T_i\}$ bounded in total (i.e. there exists a number $C$ such that $F_d(T_i) < C$ for any $T_i$) and a functional $F(T)$. Let us also be given a ray set $M \subset E^2$ with centre $O$. The inferior density of the functional $F$ with respect to $M$ on the disposition $\mathfrak{T}$ is

$$\underline{\Delta} = \underline{\Delta}_M(\mathfrak{T}) = \underline{\Delta}_M(\mathfrak{T}, F) = \liminf_{r \to \infty} \frac{\sum F(T_i)}{S(rM)},$$

where the summation is over the sets $T_i \subset rM$.
Definition: Let us be given some locally finite disposition $\mathfrak{T}$ of sets $\{T_i\}$ bounded in total and a functional $F(T)$. Let us also be given a ray set $M \subset E^2$ (measuring figure) with centre $O$. The superior density of the functional $F$ with respect to $M$ on the disposition $\mathfrak{T}$ is

$$\overline{\Delta} = \overline{\Delta}_M(\mathfrak{T}) = \overline{\Delta}_M(\mathfrak{T}, F) = \limsup_{r \to \infty} \frac{\sum F(T_i)}{S(rM)},$$

where the summation is over the sets $T_i$ such that $T_i \cap rM \ne \varnothing$.


Definition: The density with respect to a given measuring figure $M$ of a functional $F$ on a disposition $\mathfrak{T}$ is the number $\Delta_M(\mathfrak{T}, F) = \underline{\Delta}_M(\mathfrak{T}, F)$, if $\underline{\Delta}_M(\mathfrak{T}, F) = \overline{\Delta}_M(\mathfrak{T}, F)$.


Examples. For the functional $F_1(T)$ the (superior, inferior) density is the (superior, inferior) average number of elements of $\mathfrak{T}$ with respect to the set $M$. For the functionals $d(T)$ and $S(T)$ the density is the average (with respect to $M$!) diameter or area of the elements of the disposition.

2°. Independence of beginning effects.


Theorem 5.4 n.n. Let $M$ be a strongly ray set with centre $O$ and $M'$ be a set parallel to it with centre $O'$. Let $\mathfrak{T}$ be an arbitrary locally finite disposition of sets bounded in total and $F(T)$ be a functional. Then the densities $\underline{\Delta}$, $\overline{\Delta}$ (and $\Delta$, if $\underline{\Delta} = \overline{\Delta}$) of $F(T)$ with respect to the sets $M$ and $M'$ respectively coincide.


▷ Let $U_O$ be the radiance neighbourhood of the set $M$. Replacing, if necessary, the sets $M$ and $M'$ by sets homothetic to them with centres of homothety at the points $O$ and $O'$ respectively, let us assume that $O' \in U_O$. Let $\rho > 0$ be the distance between $O$ and $O'$. We shall assume that the inclusions from theorem 5.3 hold for some $t > 0$, starting from some $k_0 > 0$. We note now that the distance between each point of $M$ and the set $M'$ is not greater than $\rho$; therefore for any $k > k_0$ from theorem 5.3 we have:

$$(k - t)M \subset kM' \subset (k + t)M.$$

Let us consider the sums $\sum_{k} F(T_i)$ and $\sum'_{k} F(T_i)$. Here and further in the proof of the theorem for $\underline{\Delta}$ ($\overline{\Delta}$), the summation is over the sets $T_i$ which occur in the definition of $\underline{\Delta}$ ($\overline{\Delta}$), the first of these sums being taken for the set $kM$ and the second for $kM'$.

From the above inclusions we have:

$$\sum_{k-t} F(T_i) \le \sum{}'_{k} F(T_i) \le \sum_{k+t} F(T_i);$$

further, we have:

$$\frac{S((k-t)M)}{S(kM)} \cdot \frac{\sum_{k-t} F(T_i)}{S((k-t)M)} \le \frac{\sum'_{k} F(T_i)}{S(kM)} \le \frac{S((k+t)M)}{S(kM)} \cdot \frac{\sum_{k+t} F(T_i)}{S((k+t)M)};$$

from here we obtain:

$$\frac{(k-t)^2}{k^2} \cdot \frac{\sum_{k-t} F(T_i)}{S((k-t)M)} \le \frac{\sum'_{k} F(T_i)}{S(kM)} \le \frac{(k+t)^2}{k^2} \cdot \frac{\sum_{k+t} F(T_i)}{S((k+t)M)}.$$

Taking the superior limit in the case of $\overline{\Delta}$ and the inferior limit in the case of $\underline{\Delta}$, we finish the proof. ◁
3°. Independence of boundary effects.


Theorem 5.5 n.n. Let $M$ be a strongly ray set with centre $O$. Let also $\mathfrak{T}$ be an arbitrary locally finite disposition of sets bounded in total and $k > 0$. Then

$$\liminf_{r \to \infty} \frac{\sum S(T_i)}{S(rM)} \quad \text{and} \quad \limsup_{r \to \infty} \frac{\sum S(T_i)}{S(rM)}$$

do not depend on which of the sets $T_i$ intersecting $\operatorname{close}[(r + k)M \setminus (r - k)M]$, or which of their subsets (besides all the sets $T_i$ lying wholly in $(r - k)M$), are taken into account in the sums. These limits are equal, respectively, to

$$\liminf_{r \to \infty} \frac{\sum S(T_i)}{S(rM)} \quad \text{and} \quad \limsup_{r \to \infty} \frac{\sum S(T_i)}{S(rM)},$$

where the summation may be over $T_i \subset (r - k)M$ or over the $T_i$ such that $T_i \cap (r + k)M \ne \varnothing$. By $S(rM)$ in all limits the area of $rM$ is denoted.


▷ We shall denote by $\sum S(T_i)$ the sum in which the summation is over the $T_i \subset (r - k)M$ and also over some arbitrary collection of sets $T_i$ such that $T_i \cap (r + k)M \ne \varnothing$, and of subsets of such sets.

Let $d(T_i) < d$ for each $T_i \in \mathfrak{T}$, where $d$ is some number. We shall consider a number $t$ such that (see theorem 5.2) $d(t) > d$.

Let us denote by $\sum' S(T_i)$ the sum in which the summation is over all $T_i \subset (r - k - t)M$, and by $\sum'' S(T_i)$ the sum in which the summation is over all $T_i \subset (r + k + t)M$.


We have $\sum' S(T_i) \le \sum S(T_i) \le \sum'' S(T_i)$, and hence, up to factors $S((r \mp (k + t))M)/S(rM)$ which tend to 1 as $r \to \infty$,

$$\frac{\sum' S(T_i)}{S((r - k - t)M)} \lesssim \frac{\sum S(T_i)}{S(rM)} \lesssim \frac{\sum'' S(T_i)}{S((r + k + t)M)}.$$

On noticing that the superior (inferior) limit of the right-hand side is equal to the superior (inferior) limit of the left-hand side, we finish the proof. ◁

Bearing in mind theorems 5.4 and 5.5, we shall further use only strongly ray measuring figures when calculating the density of a functional.

4°. We shall adduce one example of a rather general character. Let us first note that the theorems on independence proved above allow us, in this example, not to care about the sets of the disposition $\mathfrak{T}$ which are near the boundary of $kM$, and to consider the measuring figure $M$ up to a parallel translation. (As for the form of the set $M$, we shall see that the situation is absolutely different.)

We shall consider the plane $E^2$ with a rectangular coordinate system and the following disposition $\mathfrak{T}$ of unit squares. The right half-plane ($x \ge 0$) is divided into such squares with integer vertices, and the left half-plane does not contain interior points of any square of $\mathfrak{T}$. Let us consider two strongly ray sets $M$ and $M'$ with common centre at the origin. Let $M$ be the rectangle with the vertices $(-1, 1)$, $(2, 1)$, $(2, -1)$ and $(-1, -1)$, and $M'$ the rectangle with the vertices $(-2, 1)$, $(1, 1)$, $(1, -1)$ and $(-2, -1)$. See pic. 5.06. It is obvious that with respect to the first set $\Delta(\mathfrak{T}, F_S) = 2/3$, and with respect to the second one $\Delta(\mathfrak{T}, F_S) = 1/3$.
pic. 5.06


So we have seen that the density of the same functional on the same disposition may turn out
to be different with respect to different measuring figures.
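The two values of the example can be checked by direct counting. The following is only an illustrative sketch in Python and not part of the original text. It assumes that the density of the area functional in $rM$ is the total area of the squares of the disposition lying in $rM$ divided by the area of $rM$, and it takes $r$ to be a positive integer so that $rM$ has integer sides. The function name is hypothetical.

```python
from fractions import Fraction

def density_in_scaled_rectangle(x_min, x_max, y_min, y_max, r):
    """Area of the unit squares of the disposition inside r*M over the area of r*M.

    M is the rectangle [x_min, x_max] x [y_min, y_max] with integer corners;
    the disposition consists of the unit squares with integer vertices that
    tile the right half-plane.  r is a positive integer (an assumption made
    here so that no square is cut by the boundary of r*M).
    """
    left, right = r * x_min, r * x_max
    bottom, top = r * y_min, r * y_max
    # Columns of unit squares [i, i+1] with i >= 0 that fit between left and right.
    columns = max(0, right - max(left, 0))
    rows = top - bottom
    covered = columns * rows
    total = (right - left) * (top - bottom)
    return Fraction(covered, total)

M = (-1, 2, -1, 1)        # vertices (-1, 1), (2, 1), (2, -1), (-1, -1)
M_prime = (-2, 1, -1, 1)  # vertices (-2, 1), (1, 1), (1, -1), (-2, -1)
for r in (1, 10, 100):
    print(r, density_in_scaled_rectangle(*M, r), density_in_scaled_rectangle(*M_prime, r))
```

For integer $r$ the ratio does not depend on $r$ at all, so the limit is reached at once; for non-integer $r$ only a boundary strip would change, which is the effect the independence theorems allow us to neglect.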


Exercise 


Construct an example of a disposition for which the density with respect to a square differs
from the density with respect to a disk (both centred at the centre of symmetry).


§5.3 Periodic Disposition.
1°. Here and in §4 we shall prove a number of theorems which, in particular, allow us to regard
the concept of density less pessimistically than might seem warranted after the last example of §2.
Let $\mathfrak{T}$ be some locally finite disposition and $M$ a given measuring figure. As in §2, we
denote the inferior (superior) density of $\mathfrak{T}$ measured by $M$ by $\underline{\Delta}_M(\mathfrak{T})$ ($\overline{\Delta}_M(\mathfrak{T})$).
If the disposition $\mathfrak{T}$ has a density with respect to $M$, we denote it by $\Delta_M(\mathfrak{T})$.
In what follows we shall need the following construction several times. Take on the plane the
square $q_0$ with centre at the origin $O$, with edges parallel to the coordinate axes
and of length 1. Then construct the tiling (quadriliage) $Q = q_0 \oplus \mathbb{Z}^2$ of $E^2$.
(Here $X \oplus Y$ denotes the vector sum, or Minkowski sum, i.e. the set of
points determined by all vectors $\overrightarrow{Ox} + \overrightarrow{Oy}$ with $x \in X$, $y \in Y$, where $O$ is an arbitrarily chosen
point, in this case the origin.)
We denote the quadriliage homothetic to $Q$ with coefficient $\lambda > 0$ by $\lambda Q$.
2°. A disposition $\mathfrak{T}$ is called periodic if it can be represented as $\mathfrak{T}^* \oplus \Gamma$, where $\mathfrak{T}^*$ is
some finite disposition and $\Gamma$ is a lattice. See pic. 5.07.


Theorem 5.6 n.n. For each periodic disposition $\mathfrak{T} = \mathfrak{T}^* \oplus \Gamma$ there exists a density that does not
depend on the measuring figure.
pic. 5.07


> By making, if necessary, a parallel translation, we may assume that the origin of $\Gamma$
and one of the points of some element of $\mathfrak{T}^*$ are placed at the origin $O$. Take an arbitrary
basic parallelogram $\Pi_0$ of the lattice $\Gamma$ with a vertex at $O$ and denote its diameter and
area by $d$ and $S$. We construct the face-to-face tiling $Q' = \Pi_0 \oplus \Gamma$ (affinely equivalent
to the tiling $Q$) with parallelograms $\Pi_A = \overrightarrow{OA}(\Pi_0)$, where $A \in \Gamma$. We
attribute the finite disposition $\overrightarrow{OA}(\mathfrak{T}^*)$ to the parallelogram $\Pi_A$.

Let $\mathfrak{T}^* = \{T_1, T_2, \ldots, T_n\}$ and let $d^*$ be the diameter of the set $T_1 \cup T_2 \cup \ldots \cup T_n$. Let us
fix a measuring figure $M$ with centre at $O$.


We shall show that $\Delta_M(\mathfrak{T}) = \frac{\sum^* S(T_i)}{S}$, where the summation $\sum^*$ is over all $T_i \in \mathfrak{T}^*$.


Indeed, let us choose a number $\rho > d + 2d^*$. Then, according to theorem 5.5, in the
expressions

$$\limsup_{r\to\infty} \frac{\sum S(T_i)}{S(rM)} \quad\text{and}\quad \liminf_{r\to\infty} \frac{\sum S(T_i)}{S(rM)}$$

the sum $\sum S(T_i)$ may be taken over all elements of $\mathfrak{T}$ attributed to those
parallelograms of $Q'$ which lie in $rM$. This sum is equal to $k \sum^* S(T_i)$, where $k$ is the
number of such parallelograms. We have


$$\liminf_{r\to\infty} \frac{\sum S(T_i)}{S(rM)} = \liminf_{r\to\infty} \frac{k\sum^* S(T_i)}{kS}\cdot\frac{kS}{S(rM)} = \frac{\sum^* S(T_i)}{S}\,\liminf_{r\to\infty} \frac{kS}{S(rM)}.$$
According to the same theorem 5.5, we have $\liminf_{r\to\infty} \frac{kS}{S(rM)} = 1$, and calculating
$\limsup_{r\to\infty} \frac{\sum S(T_i)}{S(rM)}$ analogously, we obtain

$$\liminf_{r\to\infty} \frac{\sum S(T_i)}{S(rM)} = \limsup_{r\to\infty} \frac{\sum S(T_i)}{S(rM)} = \frac{\sum^* S(T_i)}{S}.$$


But this quantity does not depend on $M$. •
As a separate remark, we note once again the important fact obtained in the proof
of theorem 5.6.


Remark: n.n. For each periodic disposition

$$\mathfrak{T} = \{T_1, T_2, \ldots, T_n\} \oplus \Gamma$$

the density is equal to $\sum^* S(T_i)/S(\Pi_0)$ for any measuring figure.
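The formula of the remark is easy to evaluate numerically. The sketch below is an illustration only: the function name and the two sample lattices are assumptions chosen here, and the area of a basic parallelogram is computed as the absolute value of the determinant of a lattice basis.

```python
import math

def periodic_density(set_areas, basis):
    """Density of a periodic disposition T* (+) Gamma.

    set_areas: areas S(T_i) of the sets of the finite disposition T*.
    basis: two vectors spanning the lattice Gamma.
    Returns (sum of S(T_i)) / (area of a basic parallelogram).
    """
    (ax, ay), (bx, by) = basis
    cell_area = abs(ax * by - ay * bx)  # area of the basic parallelogram
    if cell_area == 0:
        raise ValueError("basis vectors must be linearly independent")
    return sum(set_areas) / cell_area

# Disks of radius 1/2 centred at the points of the integer lattice.
square = periodic_density([math.pi / 4], [(1, 0), (0, 1)])
# Disks of radius 1 centred at the hexagonal lattice with minimal distance 2.
hexagonal = periodic_density([math.pi], [(2, 0), (1, math.sqrt(3))])
print(square, hexagonal)
```

The second value equals $\pi/\sqrt{12}$, the constant that appears in the theorem on the densest packing of equal disks.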


§5.4 Independence of the Densest Packing and the Thinnest Covering Densities
of the Measuring Figure


1°. Packings. Let $\{T\}$ be some family of sets with diameters not greater than the number
$d/2$. In this item $\mathfrak{T}$ (possibly with indices) will denote packings of sets $T_i$,
each of which is congruent (possibly in one of a prescribed set of ways) to one of the sets $T \in \{T\}$.


Theorem 5.7 For any measuring figure $M$, any $\varepsilon > 0$ and any packing $\mathfrak{T}_0$ there exists
a periodic packing $\mathfrak{T}_{0M}$ for which the inequality $\Delta_M(\mathfrak{T}_0) - \Delta_M(\mathfrak{T}_{0M}) < \varepsilon$ is valid.


> We consider the given figure $M$, the number $1 > \varepsilon > 0$ and the packing $\mathfrak{T}_0$. Let us
choose $a > 0$ such that the following inequality is valid


I 
(2d —a < 36a". (5.1) 


Let us consider the quadriliage $aQ$ and choose (see theorem 5.5) $\rho = a\sqrt{2} + d$ and $r_0$ such
that for all $r > r_0$ the following inequality is valid

$$0 < 1 - \frac{ka^2}{S(rM)} < \frac{1}{3}\varepsilon,$$


where $k$ is the number of squares of $aQ$ contained in $rM$. We shall assume that for all
$r > r_0$ the following condition is also valid

$$\Delta_M(\mathfrak{T}_0) - \frac{\sum S(T_i)}{S(rM)} < \frac{1}{3}\varepsilon,$$


where the sum is taken over all elements of $\mathfrak{T}_0$ intersecting the mentioned squares
(theorem 5.5 with $d(t) = \rho$). From here we have

$$\Delta_M(\mathfrak{T}_0) - \frac{\sum S(T_i)}{ka^2}\cdot\frac{ka^2}{S(rM)} < \frac{1}{3}\varepsilon.$$
Further, as $ka^2 < S(rM)$, we obtain

$$\Delta_M(\mathfrak{T}_0) - \frac{\sum S(T_i)}{ka^2} < \frac{1}{3}\varepsilon.$$


Hence, among our $k$ squares, by the Dirichlet principle, we can find at least one square (we
shall denote it by $Q$) for which

$$\Delta_M(\mathfrak{T}_0) - \frac{\sum' S(T_i)}{a^2} < \frac{1}{3}\varepsilon, \qquad (5.2)$$

where the sum $\sum'$ is taken over all elements of $\mathfrak{T}_0$ contained in the square $Q'$, concentric
to $Q$ and homothetic to it with the coefficient $(1 + \frac{d}{a})$ (and hence over all elements of $\mathfrak{T}_0$
intersecting $Q$).
Moreover the next inequality is obvious: 


/ } 2 
Ss Su 2” (1456). 


a a 


We shall make necessary estimates using this inequality and also the inequalities (5.1) 
and (5.2): 
e580) 


. a >» S(T) a? 
Am (40) — Gaye Am (Xo) — 7 


BS Sh) (1-5) tists ee 
a 3 


< Au (Xo) — ae ae 


eit abe irae : ey € (5.3) 
<- — ~—e}<[-e+  -e€] =e. 

3 3 3 3 3 

Let us construct the packing $\mathfrak{T}_{0M}$. (We note that the family $\{T\}$ may be reduced.) For
this purpose we take the quadriliage $Q' = (a + d)Q$ and replace each of its squares with
the square $Q'$ together with that part of the packing $\mathfrak{T}_0$ which is contained in $Q'$. According
to theorem 5.1, the packing $\mathfrak{T}_{0M}$ has the density $\Delta_M(\mathfrak{T}_{0M})$, and this density satisfies, by
(5.3), the required inequality. •


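Theorem 5.8 Whatever the measuring figures $M$ and $M'$ are, for packings of the form $\mathfrak{T}$
the following equalities hold:

$$\sup \underline{\Delta}_M(\mathfrak{T}) = \sup \overline{\Delta}_M(\mathfrak{T}) = \sup \underline{\Delta}_{M'}(\mathfrak{T}) = \sup \overline{\Delta}_{M'}(\mathfrak{T}).$$

(The greatest possible density of packings of the form $\mathfrak{T}$ does not depend on the measuring
figure.)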


> Suppose that for the given measuring figure $\sup \Delta_M(\mathfrak{T})$ is not attained. Let us choose $\varepsilon > 0$ and
a sequence of packings $\{\mathfrak{T}_j\}$, $j = 1, 2, \ldots$, such that $\sup \Delta_M(\mathfrak{T}) - \Delta_M(\mathfrak{T}_j) < 2^{-j-1}\varepsilon$.
Replacing $\varepsilon$ in theorem 5.7 by $2^{-j-1}\varepsilon$ ($j = 1, 2, \ldots$), we obtain a sequence of
periodic packings $\{\mathfrak{T}_{jM}\}$ whose density (and hence inferior density) satisfies
the inequalities
$$\sup \Delta_M(\mathfrak{T}) - \Delta_M(\mathfrak{T}_{jM}) < 2^{-j}\varepsilon. \qquad (5.4)$$


Therefore the first and the third equalities of the asserted chain are
proved.

Further, as all $\mathfrak{T}_{jM}$ are periodic, their densities do not depend on the measuring figure,
i.e. for any other measuring figure $M'$ we have $\Delta_{M'}(\mathfrak{T}_{jM}) = \Delta_M(\mathfrak{T}_{jM})$. The
same construction may be carried out for $\sup \Delta_{M'}(\mathfrak{T})$, giving an
appropriate sequence $\mathfrak{T}_{jM'}$ for which $\Delta_M(\mathfrak{T}_{jM'}) = \Delta_{M'}(\mathfrak{T}_{jM'})$. The second equality of the
asserted chain follows at once.

Now suppose that for the given measuring figure $\sup \Delta_M(\mathfrak{T})$ is attained for some packing $\mathfrak{T}_0$. In this
case the sequence $\mathfrak{T}_{jM}$ satisfying the inequalities (5.4) is constructed directly from
the packing $\mathfrak{T}_0$, and all further reasoning remains valid. •

2°. Coverings. Let $\{T\}$ be some family of sets whose diameters are not greater than $d$.
In this item $\mathfrak{T}$ (possibly with indices) will denote an arbitrary (locally finite)
covering by sets $T_i$, each of which is congruent (possibly in one of a prescribed set of ways) to one of
the sets $T \in \{T\}$.


Lemma 1 For any measuring figure $M$, $\inf \Delta_M(\mathfrak{T})$ is finite.


> Let us note that no set of a locally finite covering, and thus no
set $T \in \{T\}$, can have zero area. Let us consider a set $T \in \{T\}$ such that $S(T) > 0$.
As the set $T$ has positive inner measure, it contains a square $q$ with
edges parallel to the coordinate axes. We denote the length of its edge by
$\varepsilon$ and consider the quadriliage $\varepsilon Q$. We place $T$ in such a way that the square $q$
coincides with the square $\varepsilon q_0$ of $\varepsilon Q$. The periodic covering $T \oplus \varepsilon\mathbb{Z}^2$ has density not
greater than

$$S(T)/\varepsilon^2 \ge \inf \Delta_M(\mathfrak{T}).$$ •


Theorem 5.9 For an arbitrary measuring figure $M$, any $\varepsilon > 0$ and an arbitrary covering $\mathfrak{T}_0$
there exists a periodic covering $\mathfrak{T}_{0M}$ for which the inequality $\Delta_M(\mathfrak{T}_{0M}) - \Delta_M(\mathfrak{T}_0) < \varepsilon$
is valid.


> We consider the given figure $M$, the number $1 > \varepsilon > 0$ and the covering $\mathfrak{T}_0$.
Let us choose $a > 0$ such that the following inequality holds


a —(a—ay <5] 


a Qos - Z 
nwo d)* =a(a—d)’. (5.5) 


Let us consider the quadriliage $aQ$ and choose (see theorem 5.5) $\rho = a\sqrt{2}$ and $r_0$ such that
for all $r > r_0$ the following inequality is valid


ka 
< — 
S(rM) 


l<a, 
where $k$ is the number of squares of $aQ$ intersecting $rM$. We shall assume that for these $r$ the
following condition also holds


>» S(Ti) 


S(rM) — Ay (Xo) < @, 


where the sum is taken over all elements of $\mathfrak{T}_0$ intersecting the mentioned squares
(theorem 5.5). From here we have


y-S(T;)) ka? 


ite | 
ka? SirpMy SM G0) < 


Further, as $ka^2 > S(rM)$, see (5.7), we obtain


>» S(Ti) 


~~ Au (Eo) <a 


Hence, among our $k$ squares, by the Dirichlet principle, we can find at least one square (we
shall denote it by $Q$) for which


I's T, 
2, 8) > ) — Ay (80) < a, (5.6) 


where the sum $\sum'$ is taken over all elements of $\mathfrak{T}_0$ contained in the square $Q$. These elements
cover the square $Q'$, concentric to $Q$ and homothetic to it with the coefficient $(1 - \frac{d}{a})$.
We shall make the necessary estimates using (5.5) and, twice, (5.6):


y S(T) s(t) a? 
@eaap A y (Xo) = = @=ay =a — Ay (So) 
< 2D aaGILT +a)—Ay(SZo) <a rare See) ae. 
a a 
<a+aAy(%o) +a) < 32+ 56) < €, (5.7) 


Let us construct the periodic covering $\mathfrak{T}_{0M}$. (We note that the family $\{T\}$ may be
reduced.) For this purpose we take the quadriliage $Q' = (a - d)Q$ and replace each of
its squares with the square $Q'$ together with that part of the covering $\mathfrak{T}_0$ which is
contained in $Q'$. According to theorem 5.6, the covering $\mathfrak{T}_{0M}$ has the density $\Delta_M(\mathfrak{T}_{0M})$,
and this density satisfies, by (5.7), the required inequality. •


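Theorem 5.10 Whatever the measuring figures $M$ and $M'$ are, for coverings of the form $\mathfrak{T}$
the following equalities hold:

$$\inf \underline{\Delta}_M(\mathfrak{T}) = \inf \overline{\Delta}_M(\mathfrak{T}) = \inf \underline{\Delta}_{M'}(\mathfrak{T}) = \inf \overline{\Delta}_{M'}(\mathfrak{T}).$$

(The least possible density of coverings of the form $\mathfrak{T}$ does not depend on the measuring
figure.)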
> Suppose that for the given measuring figure $\inf \Delta_M(\mathfrak{T})$ is not attained. Let us choose $\varepsilon > 0$ and
a sequence of coverings $\{\mathfrak{T}_j\}$, $j = 1, 2, \ldots$, such that $\Delta_M(\mathfrak{T}_j) - \inf \Delta_M(\mathfrak{T}) < 2^{-j-1}\varepsilon$.
Replacing $\varepsilon$ in theorem 5.9 by $2^{-j-1}\varepsilon$ ($j = 1, 2, \ldots$), we obtain a sequence of
periodic coverings $\{\mathfrak{T}_{jM}\}$ whose density (and hence superior density) satisfies
the inequalities

$$\Delta_M(\mathfrak{T}_{jM}) - \inf \Delta_M(\mathfrak{T}) < 2^{-j}\varepsilon. \qquad (5.8)$$


Therefore the first and the third equalities of the asserted chain are
proved.

Further, as all $\mathfrak{T}_{jM}$ are periodic, their densities do not depend on the measuring figure,
i.e. for any other measuring figure $M'$ the equality $\Delta_{M'}(\mathfrak{T}_{jM}) = \Delta_M(\mathfrak{T}_{jM})$ holds. The
same construction may be carried out for $\inf \Delta_{M'}(\mathfrak{T})$, giving an
appropriate sequence $\mathfrak{T}_{jM'}$ for which $\Delta_M(\mathfrak{T}_{jM'}) = \Delta_{M'}(\mathfrak{T}_{jM'})$. The second equality of the
asserted chain follows at once.

Now suppose that for the given measuring figure $\inf \Delta_M(\mathfrak{T})$ is attained on some covering $\mathfrak{T}_0$. In this
case the sequence $\mathfrak{T}_{jM}$ satisfying the inequalities (5.8) is constructed directly from
the covering $\mathfrak{T}_0$, and all further reasoning remains valid. •


6 Packing of Equal Balls. Covering by Equal Balls 


§6.1 Results. Elementary Theorems
1°. Here we present the solutions of two classical problems of discrete geometry for the
case of the plane, using only the concept of L-tilings and simple theorems of elementary
geometry. The use of the theorems from §5 simplifies the derivation of the result and makes it
somewhat more precise by extending it to arbitrary measuring figures.


Theorem 6.1 2.2. For any packing of equal disks on the plane (and whatever the measuring
figure is) the density (superior density) of the packing cannot be greater than
$\pi/\sqrt{12} = \pi/(2\sqrt{3})$.

Such a density, with respect to any measuring figure, is attained on the “hexagonal” lattice
packing. See pic. 6.01.


Theorem 6.2 2.2 For any covering of the plane by equal disks (and whatever the measuring
figure is) the density (inferior density) of the covering cannot be less than $2\pi/(3\sqrt{3})$.

Such a density, with respect to any measuring figure, is attained on the “hexagonal” lattice
covering. See pic. 6.02.


In the proofs of theorems 6.1 and 6.2 we fix on the plane an arbitrary measuring
figure $M$ with centre at $O$ (see §2 of chapter 5) and consider figures of the form
$\lambda M$, where $\lambda$ is a real number.

2°. The proof will be preceded by a number of definitions and lemmas. 


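Lemma 1 For any $\alpha_i, \beta_i > 0$, with $i = 1, 2, \ldots, N$, the following inequalities hold:

$$\max(\alpha_i/\beta_i) \ge \frac{\alpha_1 + \alpha_2 + \cdots + \alpha_N}{\beta_1 + \beta_2 + \cdots + \beta_N} \ge \min(\alpha_i/\beta_i).$$

> •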
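Lemma 1 is the familiar statement that the ratio of two sums lies between the smallest and the largest of the termwise ratios. The following sketch, with a hypothetical helper name and randomly generated positive integers, simply checks it in exact rational arithmetic; it is an illustration and not a proof.

```python
from fractions import Fraction
import random

def ratio_bounds(alphas, betas):
    """Return (min ratio, ratio of sums, max ratio) for positive alphas, betas."""
    ratios = [Fraction(a, b) for a, b in zip(alphas, betas)]
    return min(ratios), Fraction(sum(alphas), sum(betas)), max(ratios)

random.seed(0)
for _ in range(1000):
    n = random.randint(1, 8)
    alphas = [random.randint(1, 50) for _ in range(n)]
    betas = [random.randint(1, 50) for _ in range(n)]
    smallest, middle, largest = ratio_bounds(alphas, betas)
    # The ratio of the sums is squeezed between the extreme termwise ratios.
    assert smallest <= middle <= largest
```

In the proofs of theorems 6.1 and 6.2 the $\alpha_i$ play the role of sector areas and the $\beta_i$ the role of triangle areas.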
Lemma 2 Among triangles with edges $a, b, c \ge 2r$ and angles less than 120°, the least
area, equal to $r^2\sqrt{3}$, is attained by the regular triangle with edge $2r$.


> Let $\angle ab$ be the greatest of the angles of the considered triangle (see pic. 6.03); then
120° $> \angle ab \ge$ 60°. We have

$$S = \frac{1}{2}ab\sin\angle ab \ge 2r^2\sin\angle ab \ge r^2\sqrt{3},$$
where the equalities hold only for $a = b = 2r$ and $\angle ab =$ 60°, respectively. •


Lemma 3 Among triangles enclosed in a disk of radius $r$, the greatest area, equal to
$3r^2\sqrt{3}/4$, is that of the inscribed regular triangle.


> Let some triangle $ABC$ be enclosed in a disk of radius $r$ with boundary circle $C$,
see pic. 6.04. By a motion of the triangle, if necessary, we may assume that $A \in C$,
and by applying, if necessary, a homothety with centre $A$ (thereby increasing
the area of the triangle), we may assume that one more vertex, say $B$, belongs to the
circle. If after this the triangle $ABC$ is not inscribed in $C$, we find the point $C'$ of intersection
of the straight line $AC$ with the circle $C$ and obtain the triangle $ABC'$, of greater
area and already inscribed in the circle. For inscribed triangles the lemma is trivial. •
pic. 6.02


§6.2 The Densest Packing of Equal Disks 


1°. Definition: A packing $\mathfrak{T} = \{T_1, T_2, \ldots, T_i, \ldots\}$ of pairwise congruent sets in $E^2$ is
called saturated if for any set $T \subset E^2$ congruent to $T_i$ there exists a set $T_i \in \mathfrak{T}$ such that
$T \cap T_i \neq \emptyset$.


Below, in §2, we consider only packings of pairwise congruent disks and denote by
$\Sigma(\mathfrak{T})$ the system of the disks’ centres.
Let us note that for a saturated packing $\mathfrak{T}$ the system $\Sigma(\mathfrak{T})$ is a Delone system.


Lemma 4 Whatever L-polygon of the L-tiling of $\Sigma(\mathfrak{T})$ is taken, for an arbitrary saturated
packing $\mathfrak{T}$, each of its angles is less than 120°.


> Assume the contrary. Let some L-polygon $L_M$ have an angle $\angle M \ge$ 120° at a point $M \in \Sigma$.
The edges $MP$ and $MQ$ of the polygon $L_M$ have length not smaller than $2r$ (recall that
$\Sigma$ is the system of the centres of the packing’s disks of radius $r$). It follows
that the circle circumscribed around $\triangle MPQ$, and thus around the polygon $L_M$,
has radius $R \ge 2r$. Thus we have obtained an empty disk of radius $R \ge 2r$, which is
impossible because $\mathfrak{T}$ is a saturated packing. •


Corollary: In the conditions of lemma 4 only triangles, quadrangles and pentagons can
be L-polygons.


> Each inscribed $n$-gon with $n > 5$ has at least one interior angle not less than
120°.

We shall note also, that in the conditions of lemma 4 neither length of an edge, nor length 
of a diagonal of an L-polygon can exceed 4r. 
pic. 6.03


Lemma 5 In the conditions of lemma 4, every L-triangle intersects only those
disks of the packing whose centres are located at its vertices.


> Each point of $\Sigma$ is covered, together with its neighbourhood, only by “its own” disk.
Therefore it is enough to show that under our conditions the circle of radius $r$ drawn about
a vertex of a triangle does not intersect the edge opposite to it. Let the triangle have edges
$a, b, c$, area $S$ and circumradius $R$. We shall calculate its height $h_a$.
We have $abc = 4RS$. From here, taking into account that $R < 2r$ and $a, b, c \ge 2r$, we
obtain
$$h_a = 2S/a = bc/2R > b/2 \ge r.$$ •


In what follows, for such L-tilings we divide each quadrangle and pentagon by the diagonals
issuing from one of its vertices into two and three triangles, respectively. It is obvious
that everything said above about L-triangles remains true for the triangles obtained in this
way.

2°. The proof of theorem 6.1 

> Let us take an arbitrary packing $\mathfrak{T}$ of disks of radius 1 whose centres form the point
system $\Sigma(\mathfrak{T})$. If this packing is not saturated, we supplement it with the same disks
until it is saturated. The density (superior density) of the packing, whatever the measuring figure
is, will not decrease because of this, so we may assume at once that the packing $\mathfrak{T}$ is saturated.
Hence we may consider the system $\Sigma(\mathfrak{T})$ as a Delone system.
pic. 6.04


Let $\delta_{\lambda M}(\mathfrak{T})$ be the total area of all disks of $\mathfrak{T}$ lying in the figure $\lambda M$,
divided by the total area of all triangles of $\{L\}_{\Sigma(\mathfrak{T})}$ which have a nonempty
intersection with $\lambda M$. We denote the area of an L-triangle by $S_{tr}$, the area of the
sectors of the $\mathfrak{T}$-disks lying in it by $S_{cir}$, and the summation over
all L-triangles intersecting $\lambda M$ by $\sum_L$. See pic. 6.05 (for simplicity, the packing in this
picture is not saturated).

We shall estimate the density of the packing by applying lemmas 1, 5 and 2:

$$\delta_{\lambda M}(\mathfrak{T}) \le \sum_L S_{cir} \Big/ \sum_L S_{tr} \le \max(S_{cir}/S_{tr}) \le \frac{\pi/2}{\sqrt{3}} = \pi/\sqrt{12}.$$


As, according to theorem 5.2, it is possible to neglect boundary effects, we have $\overline{\Delta}_M(\mathfrak{T}) \le \pi/\sqrt{12}$.
The right-hand side of the inequality does not depend on $M$, so we have

$$\overline{\Delta}(\mathfrak{T}) \le \pi/\sqrt{12}.$$


As the hexagonal packing is periodic, the second statement of the theorem is obvious. •


§6.3 The Thinnest Covering by Equal Disks
1°. Lemma 6. Let the radius of the circle circumscribed around a triangle $ABC$ be not greater
than $r$. Let $S_A$, $S_B$, $S_C$ be the sectors of the disks of radius $r$ with centres at the
points $A$, $B$, $C$, respectively, determined by the straight lines $AB$ and $AC$, $BA$ and $BC$, $CA$
and $CB$. Then the triangle $ABC$ is covered by the sectors $S_A$, $S_B$, $S_C$. See pic. 6.06.

> For the proof it is enough to consider the three figures formed by the perpendiculars dropped
from the centre of the circumscribed circle of $ABC$ onto its edges and by the halves of the
edges of $ABC$. •

2°. The proof of theorem 6.2 
pic. 6.05


> Let us take an arbitrary covering $\mathfrak{T}$ of the plane $E^2$ by disks of radius 1 with the
point system $\Sigma(\mathfrak{T})$ as centres. Thus the system $\Sigma(\mathfrak{T})$ has $R_{\Sigma(\mathfrak{T})} \le 1$. It has no limit
points, since otherwise the covering $\mathfrak{T}$ would not be locally finite. We shall estimate the density in a
bounded domain of the form $\lambda M$; in it the system $\Sigma(\mathfrak{T})$ therefore has a positive radius of
discreteness. By replacing the system $\Sigma(\mathfrak{T})$ outside $\lambda M$ by an
arbitrary Delone system with $R \le 1$, we obtain a Delone system $\Sigma_\lambda$.
For each system $\Sigma_\lambda$ we can construct the L-tiling $\{L\}_\lambda$. If $\lambda < \mu$, the tilings $\{L\}_\lambda$ and
$\{L\}_\mu$ do not differ inside $(\lambda - 1)M$. We shall further denote by $\{L\}_\lambda$ only this part
of the first tiling. By subdividing, as above, each L-polygon into triangles, we shall
consider $\{L\}_\lambda$ as a complex consisting of triangles only.

Let us consider an arbitrary triangle $L_0 \in \{L\}_\lambda$. Let $S_{tr}$ be its area. The sectors
cut out by the angles of the triangle $L_0$ from the disks of $\mathfrak{T}$ centred at the vertices of $L_0$
cover, according to lemma 6, the whole triangle. Let $S_{cir} = \pi/2$ be the sum of
the areas of these sectors. By $\sum_L$ we denote the summation over all such triangles.

Let $\sigma_{\lambda M}(\mathfrak{T})$ be the total area of all disks of $\mathfrak{T}$ which have a nonempty intersection
with the figure $\lambda M$, divided by the total area of all triangles of $\{L\}_\lambda$.
We shall estimate the density of the covering by applying lemmas 1, 6 and 3:

$$\sigma_{\lambda M}(\mathfrak{T}) \ge \sum_L S_{cir} \Big/ \sum_L S_{tr} \ge \min(S_{cir}/S_{tr}) \ge \frac{\pi/2}{3\sqrt{3}/4} = 2\pi/(3\sqrt{3}).$$


As, according to theorem 5.2, it is possible to neglect boundary effects, we have $\underline{\Delta}_M(\mathfrak{T}) \ge 2\pi/(3\sqrt{3})$.


The right-hand side of the inequality does not depend on $M$, therefore we have
$$\underline{\Delta}(\mathfrak{T}) \ge 2\pi/(3\sqrt{3}).$$


As the hexagonal covering is periodic, the second statement of the theorem is obvious. •
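The constant of theorem 6.2 can be recovered in two independent ways, which gives a quick consistency check. The sketch below is illustrative only and uses variable names chosen here: it computes the ratio of sector area to triangle area for the regular triangle inscribed in a circle of radius $r$ (the extremal case of lemma 3), and the density of the hexagonal lattice covering, whose basic parallelogram consists of two such triangles.

```python
import math

r = 1.0
side = r * math.sqrt(3)                        # edge of the regular triangle inscribed in a circle of radius r
triangle_area = math.sqrt(3) / 4 * side ** 2   # equals 3 * sqrt(3) / 4 * r**2
sector_area = math.pi * r ** 2 / 2             # three sectors whose angles sum to pi
lemma_ratio = sector_area / triangle_area

cell_area = 2 * triangle_area                  # basic parallelogram of the hexagonal lattice
lattice_density = math.pi * r ** 2 / cell_area # one disk per basic parallelogram

print(lemma_ratio, lattice_density, 2 * math.pi / (3 * math.sqrt(3)))
```

All three printed numbers agree, which reflects the fact that the lower bound of the theorem is attained by the hexagonal lattice covering.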
Exponential Diophantine Equations Involving Products
of Consecutive Integers and Related Equations 


T.N. Shorey 
Dedicated to Professor Alan Baker on his 60th birthday 


0 Introduction 


This paper contains an account of the results on the following topics: 


1. Squares in products from a block of consecutive integers
2. Equal products of consecutive integers
3. An equation of Goormaghtigh
4. An equation of Nagell-Ljunggren
5. Equal products of integers in arithmetic progressions
6. The greatest prime factor of integers in arithmetic progression
7. Cubes and higher powers in products from a block of consecutive integers
8. Perfect powers in products of integers in arithmetic progression


It is not our intention to give a historical survey. We have included 112 references and 
we hope that they will provide an access to the related results which are not mentioned in 
this paper. In Section 1, we have described the method of Erdős and Rigge that a product
of two or more consecutive positive integers is never a square. A sketch of developments
on the method of Erdős and Rigge is also included. In Section 2, we have explained an
extension of Runge’s method to exponential diophantine equations by giving a proof that 
for a given integer m > 2, there are only finitely many instances when a product of k con- 
secutive positive integers is equal to a product of mk consecutive positive integers. In the 
next section, the above method is combined with the theory of linear forms in logarithms for 
applying to an equation of Goormaghtigh. The equation of Goormaghtigh asks for integers 
whose digits are all equal to one with respect to two distinct bases. In Section 4, we give 
a survey of the results on the equation of Nagell-Ljunggren asking for perfect powers in 
integers with all the digits equal to one. It is shown that abc conjecture implies that the 
equations of Goormaghtigh and Nagell-Ljunggren have only finitely many solutions. We 
apply the latter equation and its particular case (30) with m even to show that certain num- 
bers considered by Mahler are irrational. Solving (30) with m even is equivalent to solving 
simultaneously a Nagell-Ljunggren equation and an equation of Catalan. The equation 
of Catalan is a particular case of the equation of Pillai. Thus we turn to considering the 
equations of Catalan and Pillai in Section 4. As an extension of the theorem of Schinzel and 
Tijdeman on hyper-elliptic equations in the theory of exponential diophantine equations, we 
formulate a conjecture which includes a well-known conjecture of Pillai on his equation.
The formulated conjecture follows from the generalised abc conjecture. Section 5 is a
continuation of Section 2. It gives an account of developments in the extension of Runge’s
method to exponential diophantine equations on equal products in arithmetic progressions.
Section 7 explains an elementary method of Erdős by proving that there are only finitely
many possibilities when a product of two or more consecutive positive integers is a cube
or a higher power. We also consider a more general question on an upper bound for the
number of integers in a block of consecutive positive integers such that their product is a
cube or higher power. The treatment of this more general question is no longer elementary.
Section 8 gives analogues of the results of Section 7 for arithmetic progressions. An
assumption $\gcd(n, d) = 1$ in a result on the equation $n(n+d)\cdots(n+(k-1)d) = by^l$ has been
relaxed to $d \nmid n$ in Section 8. Further we show that the abc conjecture implies that $k$ is bounded
by an absolute constant whenever the preceding equation with $\gcd(n, d) = 1$ and $l > 3$
holds. Section 6 is concerned with improvements and extensions of a well-known theorem
of Sylvester that a product of $k$ consecutive positive integers $> k$ is divisible by a prime
exceeding k. It gives estimates on particular cases of equations considered in Sections 1, 
7, 8 and the results of Sections 7, 8 imply some estimates related to the ones considered in 
Section 6. The paper points out crucial basic tools and it suggests several problems for fur- 
ther investigations. It aims at updating the book of Shorey and Tijdeman [95] on exponential 
diophantine equations written in 1986. An article with similar intention was already written 
by Tijdeman [109]. The proofs of several results of this book depend on its Theorems B.3 
and B.4 due to van der Poorten on p-adic linear forms in logarithms. The proofs of van der 
Poorten of these theorems have turned out to be incorrect. But correct proofs have been 
established by Yu [111], [112]. Hence all the consequences of Theorems B.3 and B.4 in 
the book are valid. All the constants appearing in this paper are effectively computable. 
Unless otherwise specified, the results mentioned in this paper are effective. For an integer
$\nu > 1$, we write $P(\nu)$ and $\omega(\nu)$ for the greatest prime factor and the number of distinct
prime divisors of $\nu$, respectively, and we put $P(0) = P(1) = 1$, $\omega(0) = \omega(1) = 0$.


1 Squares in Products from a Block of Consecutive Integers 


Erdős [22] and Rigge [63] proved in 1939 that a product of two or more consecutive positive
integers is never a square. Further Erdős and Selfridge [27] proved in 1975 the analogous
statement for cubes and higher powers. Therefore a product of two or more consecutive
positive integers is never a power. The first contribution in this direction dates back to
1724 when Goldbach, in a letter to D. Bernoulli, showed that a product of three consecutive
positive integers is not a square. The theorem of Erdős-Rigge is equivalent to saying
that the equation


$$n(n+1)\cdots(n+k-1) = y^2 \quad \text{in integers } n > 0,\ y > 0,\ k \ge 2 \qquad (1)$$


has no solution. First we give a sketch of the proof that (1) has no solution whenever k 
exceeds a sufficiently large absolute constant. 
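For small parameters the statement that (1) has no solution can be confirmed by exhaustive search. The sketch below is purely illustrative: the function names and the search bounds are arbitrary choices made here, and a finite search of course proves nothing about large $n$ or $k$.

```python
from math import isqrt, prod

def is_square(m):
    """True if the non-negative integer m is a perfect square."""
    s = isqrt(m)
    return s * s == m

def square_products(max_n, max_k):
    """All (n, k) with 1 <= n <= max_n and 2 <= k <= max_k such that
    n(n+1)...(n+k-1) is a perfect square."""
    return [(n, k)
            for n in range(1, max_n + 1)
            for k in range(2, max_k + 1)
            if is_square(prod(range(n, n + k)))]

# By the theorem of Erdős and Rigge this list is empty for every choice of bounds.
print(square_products(300, 8))
```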

If $n \le k$, we refer to Bertrand’s postulate to find a prime $p$ satisfying $n \le \frac{n+k}{2} < p \le
n+k-1$ and dividing the left hand side of (1) to the first power. Therefore (1) implies that
$n > k$. Now we apply a well-known theorem of Sylvester [105] to find a prime $q$ exceeding
$k$ and dividing the left hand side of (1). Consequently there is a unique $i$ with $0 \le i < k$ such
that $q \mid (n + i)$. Furthermore we derive from (1) that $n + i$ is divisible by $q^2$. Hence
$$n + k > n + i \ge q^2 \ge (k+1)^2$$
which implies that
$$n > k^2. \qquad (2)$$
We assume that $k$ exceeds a sufficiently large absolute constant so that all our estimates
are valid. By (1), we write


$$n + i = a_ix_i^2 \quad \text{for } 0 \le i < k$$


where $a_i$ is square free and $P(a_i) < k$. The inequality (2) implies that $a_0, a_1, \ldots, a_{k-1}$ are
distinct and this is fundamental for the proof. Let $a_i = a_j$ with $i > j$. Then


$$k > i - j = a_j(x_i^2 - x_j^2) = a_j(x_i - x_j)(x_i + x_j) \ge 2a_jx_j = 2(a_j^2x_j^2)^{1/2} \ge 2n^{1/2} \qquad (3)$$

which contradicts (2). The proof depends on comparing a lower and an upper bound for


$$A = a_0a_1\cdots a_{k-1}.$$
Since $a_0, a_1, \ldots, a_{k-1}$ are distinct, it is clear that
$$A \ge k!.$$
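The distinctness of the square free parts uses only the inequality $n > k^2$, not the hypothetical equation (1), so it can be observed on actual blocks of integers. In the sketch below, which is an illustration with hypothetical function names, $a_i$ is taken to be the square free part of $n + i$; without assuming (1) these numbers need not satisfy $P(a_i) < k$, but they are still distinct and their product is at least $k!$.

```python
from math import factorial, prod

def squarefree_part(m):
    """Return the square free a with m = a * x**2."""
    a, p = 1, 2
    while p * p <= m:
        exponent = 0
        while m % p == 0:
            m //= p
            exponent += 1
        if exponent % 2 == 1:
            a *= p
        p += 1
    return a * m  # the remaining m is 1 or a prime occurring to the first power

def squarefree_parts(n, k):
    """Square free parts of n, n+1, ..., n+k-1."""
    return [squarefree_part(n + i) for i in range(k)]

for k in range(2, 9):
    for n in range(k * k + 1, k * k + 200):
        parts = squarefree_parts(n, k)
        assert len(set(parts)) == k          # pairwise distinct, as (3) shows
        assert prod(parts) >= factorial(k)   # product of k distinct positive integers
```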
This can be sharpened to
$$A > \left(\frac{3}{2}\right)^k k!, \qquad (4)$$
by using that $a_0, a_1, \ldots, a_{k-1}$ are square free. Now we turn to an upper bound for $A$. If a
prime $p$ divides $a_i$ and $a_j$ with $i > j$, then $i \equiv j \pmod p$. Therefore, since the $a_i$’s are square
free, we have


$$\mathrm{ord}_p(A) \le \left[\frac{k}{p}\right] + 1.$$


Thus
$$A \mid k!\prod_{p \le k} p.$$

In particular
$$A \le k!\prod_{p \le k} p < k!\,3^k.$$
Our intention is to find an upper bound for $A$ less than the lower bound (4). The above
upper bound is not sufficient for this purpose and we must sharpen it. Let $q \in \{2, 3\}$ and
we write
$$g_q = \mathrm{ord}_q(A), \quad h_q = \mathrm{ord}_q(k!).$$
In fact we have
$$A \le k!\Bigl(\prod_{p \le k} p\Bigr)2^{g_2 - h_2}3^{g_3 - h_3}.$$


Since $a_i$ is square free, a prime divides $a_i$ if and only if it divides $n + i$ to an odd power.
Thus one can imagine that $g_q$ is much smaller than $h_q$. This is the case:


po a 

Pa ae log 2 

and : reste 
og 
—h3<--+2 

§3 35 A + og + 

We obtain
$$A < k!\,3^k\,2^{-2k/3}\,3^{-k/4}\,36k^4. \qquad (5)$$


Finally we combine (4) and (5) to obtain
$$(1.04)^k < \left(\frac{4^{1/3}3^{1/4}}{2}\right)^k < 36k^4$$


which is not possible if $k$ is sufficiently large. Hence (1) has no solution whenever $k$ exceeds
a sufficiently large absolute constant.

We always write $b$ for a positive integer such that $P(b) \le k$. Let $d_1, \ldots, d_t$ with $t \ge 2$
be distinct integers in $[0, k)$. We consider


$$(n + d_1)\cdots(n + d_t) = by^2. \qquad (6)$$
Equation (6) with $t = k$ and $b = 1$ coincides with (1). It is natural to suppose that the left
hand side of (6) is divisible by a prime exceeding $k$. Then Saradha [67], [68] showed that
equation (6) with $t = k$ and $k \ge 3$ has no solution other than the one given by $\binom{50}{3} = 140^2$.
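The exceptional solution corresponds to $n = 48$, $k = 3$, $b = 6$, since $48 \cdot 49 \cdot 50 = 6 \cdot 140^2$. For $k = 3$ the equation can be explored by a small search; the sketch below is an illustration with hypothetical function names and an arbitrary search bound. It uses the observation that a prime exceeding 3 divides at most one of three consecutive integers, so $n(n+1)(n+2) = by^2$ with $P(b) \le 3$ holds exactly when the square free part of each factor is one of 1, 2, 3, 6.

```python
from math import comb

def squarefree_part(m):
    """Return the square free a with m = a * x**2."""
    a, p = 1, 2
    while p * p <= m:
        exponent = 0
        while m % p == 0:
            m //= p
            exponent += 1
        if exponent % 2 == 1:
            a *= p
        p += 1
    return a * m

def has_prime_factor_above_3(m):
    """True if m is divisible by some prime exceeding 3."""
    for p in (2, 3):
        while m % p == 0:
            m //= p
    return m > 1

def solutions_k3(max_n):
    """All n <= max_n with n(n+1)(n+2) = b*y**2, P(b) <= 3, such that the
    product is divisible by a prime exceeding k = 3."""
    found = []
    for n in range(1, max_n + 1):
        if not has_prime_factor_above_3(n * (n + 1) * (n + 2)):
            continue
        if all(squarefree_part(n + i) in (1, 2, 3, 6) for i in range(3)):
            found.append(n)
    return found

print(comb(50, 3), 140 ** 2)
print(solutions_k3(5000))  # the theorem of Saradha predicts that only n = 48 occurs
```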
This includes a result of Erdős [23] of 1951 on (6) with $t = k$ and $b = k!$, i.e.


$$\binom{n+k-1}{k} = y^2.$$


The assumption $k \ge 3$ is necessary in the above result of Saradha; equation (6) with
$t = k = b = 2$ has infinitely many solutions. Now we consider the question of relaxing the
assumption $t = k$. Erdős [25] observed in 1955 that the proof given above allows one to show
that there exist absolute constants $C_1$ and $C_2$ such that (6) with


$$n > k^2, \quad t \ge k - C_1\frac{k}{\log k} \qquad (7)$$


implies that $k \le C_2$. The assumption $n > k^2$ is satisfied if the left hand side of (6) is
divisible by a prime exceeding $k$. Saradha [67] showed that equation (6) with $n > k^2$,
$t \ge k - \frac{(.0156)k}{\log k}$ implies that $k \le 870$. It is easy to see that the above proof fails if (7) is
replaced by
$$n > k^2, \quad t \ge k - \frac{kf(k)}{\log k}$$


with $f(k) \to \infty$ as $k \to \infty$. Therefore we should look for replacing the idea of saving
powers of 2 and 3 in the proof given above by a different argument and it was found by
Shorey [84] in 1986. For $\varepsilon > 0$, Shorey [87] applied in 1987 Baker’s theorem on integral
solutions of hyper-elliptic equations and sieve-theoretic arguments to show that (6) with


$$n > k^2, \quad t \ge k - (1 - \varepsilon)k\frac{\log\log k}{\log k} \qquad (8)$$


implies that $k$ is bounded by a number depending only on $\varepsilon$. This was asked by Erdős [25]
in 1955 when he proved the analogous result for higher powers. We shall explain the proof.
Before this we continue describing further results on (6). Balasubramanian and Shorey [7]
obtained a further relaxation of the assumption (8), namely,


n>k*,t> wp (9) 


where 


loglogk — log log log k Oo 
ee = kyl - —— + — 
log k log k log k 


for some absolute constant 69. We fix 69 and write 
F(k) = k(log k)/log log k for k > 3. 


We consider (6) with 


e!-%+6 Fk) <n < k?. (10) 
Then 
P(n+d;) <k forl <i <t. 
For every prime $p \le k$, we choose an $f(p) \in \{n + d_1, \ldots, n + d_t\}$ such that $p$ does not
appear to a higher power in the factorisation of any other $n + d_i$ with $1 \le i \le t$. Then, by
an argument of Erdős [25], we have


D_-T eS ier > og 
nh" <T](ant+d)<[]p’ P <k 
psk 


which, by (10), implies that t < jx. Here the product []™ is taken after omitting all f(p) 
with p < k. The above argument has turned out to be basic and we shall apply it again 
in Sections 6, 7, 8 where we refer to it as ‘a fundamental argument of Erdés’. Further we 
observe that the assumption (9) can be replaced by 


n> e!—%+€ E(k), t > pg. 
On the other hand, Balasubramanian and Shorey [7] showed that the preceding assumption
is close to the best possible. More precisely, they [7] proved: Let 


$$\epsilon > 0, \quad k \ge 3 \quad \text{and} \quad n \le e^{1-\theta_0-\gamma-\epsilon} F(k)$$


where $\gamma$ denotes Euler's constant. For $k \ge k_0 = k_0(\epsilon)$, we can find distinct integers
$d_1, \ldots, d_t \in [0, k)$ with $t \ge \mu_k$ such that $(n + d_1) \cdots (n + d_t)$ is a square. Now we give
a sketch of the proof of this assertion. For a set T of positive integers, we write $\omega(T)$ for
the number of distinct prime divisors of all elements of T. Then it is well-known that there
exists a subset T' of T with

$$|T'| \ge |T| - \omega(T)$$

such that the product of all elements of T' is a square. We take T to be the set of all n + i with
$0 \le i < k$ such that $P(n + i) \le k$ and we write $T_1$ for the complement of T in
$\{n, n+1, \ldots, n+k-1\}$. It is enough to show that
$$|T| \ge \mu_k + \pi(k)$$
i.e.
$$|T_1| \le k - \mu_k - \pi(k).$$


It is easy to check that 


y k —1 — | 
1<A<(n+k—-1)/k 


The proof is completed by applying deep results on difference between consecutive primes 
for obtaining the desired estimate. 
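
The existence of the subset T' with $|T'| \ge |T| - \omega(T)$ is a linear algebra statement over the field with two elements: each element of T gives the vector of parities of its prime exponents, and a subset has square product exactly when its vectors sum to zero. The following Python sketch illustrates one way to construct such a T'. The function names and the elimination strategy are illustrative assumptions, not taken from [7].

```python
from math import isqrt, prod

def odd_prime_mask(n, index):
    """Bitmask of the primes dividing n to an odd power.

    `index` maps each prime seen so far to a bit position and is extended in place.
    """
    mask = 0
    p = 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            mask ^= 1 << index.setdefault(p, len(index))
        p += 1
    if n > 1:
        mask ^= 1 << index.setdefault(n, len(index))
    return mask

def square_product_subset(T):
    """Return a sublist T' of T with square product and |T'| >= |T| - omega(T)."""
    index = {}
    masks = [odd_prime_mask(n, index) for n in T]
    basis = {}          # pivot bit -> (reduced vector, bitmask of positions in T)
    independent = set() # positions whose vectors were added to the basis

    def reduce_vec(vec, combo):
        # Eliminate leading bits of vec using the basis, tracking which
        # original positions were combined.
        while vec:
            pivot = vec.bit_length() - 1
            if pivot not in basis:
                break
            bvec, bcombo = basis[pivot]
            vec ^= bvec
            combo ^= bcombo
        return vec, combo

    for i, mask in enumerate(masks):
        vec, combo = reduce_vec(mask, 1 << i)
        if vec:
            basis[vec.bit_length() - 1] = (vec, combo)
            independent.add(i)

    # Keep every dependent element, then cancel their total parity vector
    # with a suitable set of basis elements.
    dependent = [i for i in range(len(T)) if i not in independent]
    total = 0
    for i in dependent:
        total ^= masks[i]
    _, combo = reduce_vec(total, 0)
    chosen = dependent + [i for i in independent if (combo >> i) & 1]
    return [T[i] for i in sorted(chosen)]

def is_square(n):
    return isqrt(n) ** 2 == n
```

At most $\omega(T)$ elements can enter the basis, so at least $|T| - \omega(T)$ elements are kept, which is the bound quoted above.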

As promised, we turn to giving a sketch of the proof that (6) with (8) implies that k is
bounded by a number depending only on $\epsilon$. We may assume that k exceeds a sufficiently
large number depending only on $\epsilon$. By (6), we have

$$n + d_i = a_i x_i^2 \quad \text{for } 1 \le i \le t \tag{11}$$
where $a_i$ and $x_i$ are positive integers such that $P(a_i) \le k$ and $a_i$ is square free. We put
$$S = \{a_1, \ldots, a_t\}.$$
As earlier we observe from $n > k^2$ that |S| = t and


fiptcet | ape Sk). 
ps<k 


Then there exists a subset $S_1$ of S such that
$$|S_1| \ge \epsilon k/2 \tag{12}$$

and
$$a_i \le k(\log k)^{1-\epsilon/2} \quad \text{if } a_i \in S_1. \tag{13}$$




Further we notice from (11), (8) and (13) that

$$x_i > k^{1/4} \quad \text{if } a_i \in S_1. \tag{14}$$

We write $S_2$ for the set of all $a_i \in S_1$ such that $a_i \le 3k$ and we denote by $S_3$ the complement
of $S_2$ in $S_1$. We split the proof in two cases:


Case I. $|S_2| \le \epsilon k/4$

Then we observe from (12) that
$$|S_3| \ge \epsilon k/4. \tag{15}$$

Further we use $a_i > 3k$ for showing as in (3) that the products
$$a_i a_j \quad \text{with } a_i \in S_3,\ a_j \in S_3$$
are distinct. This property is restrictive; Erdős [25] showed by a sieve argument that
$$|S_3| \le 2G/\log G \tag{16}$$
where G is the largest element of $S_3$. By (16) and (13), we have
|S3| < k(log ky" 
which contradicts (15). 
Case II. $|S_2| > \epsilon k/4$

We start with three distinct elements of $S_2$. By permuting $d_1, \ldots, d_t$, there is no loss of
generality in numbering them as $a_1$, $a_2$ and $a_3$. We write from (11) that

$$a_2 x_2^2 = a_1 x_1^2 + (d_2 - d_1), \quad a_3 x_3^2 = a_1 x_1^2 + (d_3 - d_1).$$


We put
$$a = \gcd(a_1, a_2, a_3), \quad b_i = a^{-1} a_i \quad \text{for } 1 \le i \le 3$$
and
$$R = a^{-1}(d_2 - d_1), \quad R' = a^{-1}(d_3 - d_1).$$
Then

$$b_2 b_3 (x_2 x_3)^2 = (b_1 x_1^2 + R)(b_1 x_1^2 + R').$$


Thus we have arrived at a hyper-elliptic equation. For applying results on integral solutions
of hyper-elliptic equations, we must have control on the magnitude of $b_1$, $b_2$ and $b_3$. This
is provided by a sieving argument due to Erdős. In this case, we can find three distinct
elements of $S_2$ such that $b_1$, $b_2$ and $b_3$ are bounded. Now we apply a theorem of Baker [3] to
derive that $\max(x_1, x_2, x_3)$ is bounded which, together with (14), implies that k is bounded.
Hence we conclude the assertion that (6) with (8) implies that k is bounded by a number
depending only on $\epsilon$. The proof of Baker's theorem depends on his theory of linear forms in
logarithms. This deep theory finds applications at several places in this paper and we refer
to Baker [4] for an account.

2 Equal Products of Consecutive Integers


We consider (1) with the left hand side replaced by two blocks of consecutive integers.
More precisely, we consider the equation

$$X(X+1)\cdots(X+K-1)\,Y(Y+1)\cdots(Y+K'-1) = Z^2$$

where $K \ge 2$, $K' \ge 2$, $K + K' \ge 5$ and $X \ge Y + K'$. It has been conjectured (see [26]) that
this equation has only finitely many solutions in all the integral variables X > 0, Y > 0,
Z > 0, K and K'. This conjecture implies that

$$x(x+1)\cdots(x+k-1) = y(y+1)\cdots(y+k+\ell-1)$$

has only finitely many solutions in integers x > 0, y > 0, $k \ge 2$ and $\ell \ge 0$ satisfying
$x \ge y + k + \ell$. More generally, for relatively prime positive integers A and B, Erdős [26]
conjectured in 1975 that there are only finitely many integers x > 0, y > 0, $k \ge 3$, $\ell \ge 0$
with $x \ge y + k + \ell$ satisfying

$$A\,x(x+1)\cdots(x+k-1) = B\,y(y+1)\cdots(y+k+\ell-1). \tag{17}$$


All the solutions of (17) have been determined for several values of $(A, B, k, \ell)$ and we
refer to [91, p. 239] for an account of these results. The first result in this direction is due
to Mordell [48] that (17) with A = B = 1, k = 2, $\ell = 1$ implies that x = 2, y = 1 or
x = 14, y = 5. For a given $(A, B, k, \ell)$, Beukers, Shorey and Tijdeman [11] confirmed
the above conjecture of Erdős. For the proof, they showed that the underlying curve is
irreducible and it has positive genus. Then the assertion follows from a theorem of Siegel [104]
on integral points on curves but it provides no explicit bound for the magnitude of the
solutions. In fact they determined all $(A, B, k, \ell)$ for which the curve has genus one and (17) has
only finitely many rational solutions for all other $(A, B, k, \ell)$ by a theorem of Faltings [28]
on Mordell's conjecture. In particular, they [11] proved that (17) with A = B = 1 has
only finitely many rational solutions. These results are non-effective in the sense that they
provide no explicit estimate for the heights of the solutions. If $\ell = 0$, Shorey [82] confirmed
the conjecture of Erdős when x − 1 and y − 1 are composed of fixed primes. Saradha and
Shorey [69] extended this result to all $\ell \ge 0$ and the proof depends on the theory of linear
forms in logarithms. Further Saradha and Shorey [69] showed that (17) with $x \ge y + k + \ell$
implies that
$$x - y \ge C_3\, x^{2/3}$$

for a certain number $C_3 > 0$ depending only on A and B. It is a difficult problem to confirm
the above conjecture in general. Even a very particular case A = B = 1, k = 2 and y = 1
of (17) is an open problem. We consider (17) with A = B = 1 and $k + \ell$ an integral
multiple of k. For an integer $m \ge 2$, we re-write (17) in this case as

$$x(x+1)\cdots(x+k-1) = y(y+1)\cdots(y+mk-1) \quad \text{in integers } x > 0,\ y > 0,\ k \ge 2. \tag{18}$$
MacLeod and Barrodale [40] considered (18) with m = 2. If m = 2, equation (18) has a
solution given by
$$8 \cdot 9 \cdot 10 = 6!$$

i.e.

$$x = 8, \quad y = 1, \quad k = 3. \tag{19}$$
MacLeod and Barrodale [40] observed that (19) is the only solution of (18) with m = 2 
and k < 5. Saradha and Shorey [69] showed that (19) is the only solution of (18) with 
m = 2. Further Saradha and Shorey [70] proved that (18) with m = 3, 4 has no solution 
and Mignotte and Shorey [47] verified it for m = 5, 6. In general Saradha and Shorey [71] 
showed that (18) implies that 

$$\max(x, y, k) \le C_4(m) \tag{20}$$

where $C_4(m)$ is a number depending only on m. We have not been able to replace $C_4(m)$
by an absolute constant. It is likely that (18) with $m \ge 3$ has no solution.
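
The solution (19) is easy to recover by machine. The sketch below searches (18) with m = 2 over a small box of values of k and y; the function names and search bounds are illustrative choices, and a finite search of this kind only confirms the statement in the range searched.

```python
from math import prod

def block(start, length):
    """Product start * (start + 1) * ... * (start + length - 1)."""
    return prod(range(start, start + length))

def block_start(target, length):
    """Return x with block(x, length) == target, or None.

    block(x, length) is increasing in x, so bisection on x suffices.
    """
    lo, hi = 1, target
    while lo < hi:
        mid = (lo + hi) // 2
        if block(mid, length) < target:
            lo = mid + 1
        else:
            hi = mid
    return lo if block(lo, length) == target else None

def solutions_m2(max_k, max_y):
    """All (x, y, k) with 2 <= k <= max_k and 1 <= y <= max_y solving (18) for m = 2."""
    found = []
    for k in range(2, max_k + 1):
        for y in range(1, max_y + 1):
            x = block_start(block(y, 2 * k), k)
            if x is not None:
                found.append((x, y, k))
    return found
```

For instance, `solutions_m2(6, 100)` scans $2 \le k \le 6$ and $1 \le y \le 100$; by the result of Saradha and Shorey [69] quoted above, the only solution it can report is (19).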

Now we give a sketch of the proof of (20). There is no loss of generality in proving (20)
for integers $x \ge 0$, $y \ge 0$ and $k \ge 2$ satisfying

$$(x+1)\cdots(x+k) = (y+1)\cdots(y+mk). \tag{21}$$


By counting the powers of 2 on both the sides of equation (21), we find that x and y are 
large as compared with k. In fact 


x= ON aay 
We write
$$(z+1)\cdots(z+mk) = \sum_{j=0}^{mk} A_j(m,k)\, z^{mk-j}. \tag{22}$$

Further we determine rational numbers

$$B_j = B_j(m,k) \quad \text{for } 1 \le j \le m$$

such that
$$(z^m + B_1 z^{m-1} + \cdots + B_m)^k = \sum_{j=0}^{mk} H_j(m,k)\, z^{mk-j} \tag{23}$$
satisfies
$$H_j(m,k) = A_j(m,k) \quad \text{for } 0 \le j \le m. \tag{24}$$
By (23) and (24),
$$k B_1 = A_1(m,k),$$
$$k B_2 + \binom{k}{2} B_1^2 = A_2(m,k),$$
$$\cdots$$

Therefore $B_1, \ldots, B_m$ are determined recursively.
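
The recursion for $B_1, \ldots, B_m$ can be carried out mechanically with exact rational arithmetic. The following Python sketch (helper names are illustrative, and polynomials are stored as coefficient lists with the highest degree first, so that index j corresponds to $z^{mk-j}$) computes the $A_j$ of (22), solves (24) one coefficient at a time, and lets one compare $H_j$ with $A_j$ beyond j = m.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, k):
    """k-th power of a polynomial by repeated multiplication."""
    out = [Fraction(1)]
    for _ in range(k):
        out = poly_mul(out, p)
    return out

def A_coeffs(m, k):
    """Coefficients A_0, ..., A_{mk} of (z+1)(z+2)...(z+mk) as in (22)."""
    out = [Fraction(1)]
    for i in range(1, m * k + 1):
        out = poly_mul(out, [Fraction(1), Fraction(i)])
    return out

def B_coeffs(m, k):
    """Solve H_j = A_j for j = 1, ..., m, returning [B_1, ..., B_m]."""
    A = A_coeffs(m, k)
    B = [Fraction(0)] * m            # B[j-1] holds B_j
    for j in range(1, m + 1):
        # With B_j still zero, H[j] is the part of H_j coming from B_1..B_{j-1};
        # the full coefficient is that part plus k * B_j.
        H = poly_pow([Fraction(1)] + B, k)
        B[j - 1] = (A[j] - H[j]) / k
    return B
```

For m = k = 2 this gives $B_1 = B_2 = 5$, reflecting the identity $(z+1)(z+2)(z+3)(z+4) + 1 = (z^2 + 5z + 5)^2$: here $H_3 = A_3 = 50$ while $H_4 = 25 \ne 24 = A_4$.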
Let z be a sufficiently large number as compared with k and m. The relations (24) imply
that the left hand side of (22) is close to the left hand side of (23). Therefore the k-th root
of the left hand side of (22) is close to the k-th root of the left hand side of (23). We use
this observation with z replaced by x and y, since x and y are large as compared with k and
m. The k-th root of the left hand side of (21) is close to $x + \frac{k+1}{2}$ and the k-th root of the
right hand side of (21) is close to $y^m + B_1 y^{m-1} + \cdots + B_m$. Consequently we derive that
$x + \frac{k+1}{2}$ is close to $y^m + B_1 y^{m-1} + \cdots + B_m$. In fact, we show that

$$\left| x + \frac{k+1}{2} - (y^m + B_1 y^{m-1} + \cdots + B_m) \right| < \tau$$

where
$$\tau = (2\, \mathrm{lcm}(\mathrm{den}(B_1), \ldots, \mathrm{den}(B_m)))^{-1}.$$

Therefore

$$x = y^m + B_1 y^{m-1} + \cdots + B_m - \frac{k+1}{2}.$$
We substitute this relation in (21) to get

$$H_j(m,k) = A_j(m,k) \quad \text{for } 0 \le j < 2m.$$


Thus we have added m − 1 relations to (24) with which we started. This is the main idea of
the proof. Further Balasubramanian (see [71, Appendix]) showed that these 2m − 1 relations
cannot hold whenever k is sufficiently large. For a different and more general proof of the
preceding assertion, see Balasubramanian and Shorey [6]. Thus we may suppose that k is
bounded and y is sufficiently large. Then we derive the polynomial identity


$$(X+1)\cdots(X+k) = (Y+1)\cdots(Y+mk),$$

where $X = Y^m + B_1 Y^{m-1} + \cdots + B_m - \frac{k+1}{2}$.
This is not possible and the assertion (20) follows. 

We observe that k is a variable in equation (21). The left hand side of (21) is a polynomial 
in x of degree k and the right hand side is a polynomial in y of degree mk. Thus the 
exponents also appear as variables in equation (21). Therefore (21) is an exponential 
diophantine equation. Thus the above method can be considered as an extension of a 
method of Runge [64] to some exponential diophantine equations. We shall apply the above 
method to an equation of Goormaghtigh which we consider in the next section. Let $\delta = 1$
or $\delta = 2$ according as m is odd or even, respectively. It has been conjectured in Saradha
and Shorey [71] that $H_{m+\delta}(m,k) \ne A_{m+\delta}(m,k)$. It is easy to see that
$H_{m+1}(m,k) = A_{m+1}(m,k)$ if m is even. Glesser [70, Appendix] and Mignotte and Shorey [47] checked the
conjecture for $m \le 12$ and $13 \le m \le 20$, respectively. Further Mignotte and Shorey [47]
showed that (21) with $H_{m+\delta}(m,k) \ne A_{m+\delta}(m,k)$ implies that


ys 1.026 x 2” (m +1+ 5) KMt14+9 (Imkymt2+9 


and 


y> ql/magk(L—1/m—(logy(mk))/mk) __ (=) 
- Z 
where $\log_2$ denotes the logarithm with respect to the base 2. Thus the above estimates are
valid for $7 \le m \le 20$. We recall that (21) with $2 \le m \le 6$ has no solution other than
m = 2, x = 7, y = 0, k = 3.


3 An Equation of Goormaghtigh 


We start this section with the well-known abc conjecture: Let $\epsilon > 0$ and a, b, c be
relatively prime positive integers satisfying

$$a + b = c.$$
Then there exists a number $\kappa$ depending only on $\epsilon$ such that
$$c \le \kappa\, G^{1+\epsilon}$$

where

$$G = \prod_{p \mid abc} p.$$

Next we recall that Nagell [51] confirmed a conjecture of Ramanujan [60] that the solutions
of

$$x^2 + 7 = 2^n \quad \text{in integers } x > 0,\ n > 0 \tag{25}$$
are given by (x, n) = (1, 3), (3, 4), (5, 5), (11, 7), (181, 15). We consider an equation of
Goormaghtigh

$$\frac{x^m - 1}{x - 1} = \frac{y^n - 1}{y - 1} \quad \text{in integers } x > 1,\ y > 1,\ m > 2,\ n > 2 \text{ with } x \ne y. \tag{26}$$

It has been conjectured that (26) has only finitely many solutions. Goormaghtigh [30]
observed in 1917 that

$$31 = \frac{2^5 - 1}{2 - 1} = \frac{5^3 - 1}{5 - 1}, \quad 8191 = \frac{2^{13} - 1}{2 - 1} = \frac{90^3 - 1}{90 - 1}. \tag{27}$$


There is no loss of generality in assuming that x < y in (26) and then m > n. Further we
re-write (26) as
$$(y - 1)x^m - (x - 1)y^n = y - x. \tag{28}$$


We show that the abc conjecture implies that (26) has only finitely many solutions. By
applying the abc conjecture to (28) after dividing both the sides by $g = \gcd((x - 1)y^n, y - x)$,
we have

$$(y - 1)x^m \le \kappa_1 G_1^{1+\epsilon}$$

where $\epsilon = 1/100$, $\kappa_1$ is an absolute constant and $G_1$ is the product of all prime divisors of
$xy(x - 1)(y - 1)(y - x)/g$. Therefore


$$x^{m-2-2\epsilon} \le \kappa_1\, y^{2+3\epsilon}$$
and we may suppose that y exceeds a sufficiently large absolute constant. On the other
hand, we see from (26) that $y^{n-1} < 2x^{m-1}$ which implies that

$$x^{m-2-2\epsilon} \ge \frac{1}{2}\, y^{(n-1)(m-2-2\epsilon)/(m-1)}.$$


Finally we combine the preceding two inequalities to conclude that n = 3. I thank
A. Granville for pointing out an inaccuracy in the above application of the abc conjecture
in an earlier draft of this paper. Further N. Saradha observed that the abc conjecture implies
that (26) with n = 3 has only finitely many solutions. For this, we observe that $m \ge 4$ and
we re-write (26) with n = 3 as

$$(2y + 1)^2 = 4(x^{m-1} + \cdots + x) + 1$$

and we check that the roots of the polynomial $4(X^{m-1} + \cdots + X) + 1$ are simple. Then
we may suppose that m > 6 by a result of Baker [3] on integral solutions of hyper-elliptic
equations. Now we re-write (26) as

$$4x^m = (x - 1)(2y + 1)^2 + (3x + 1)$$

and we apply the abc conjecture to conclude the assertion.

It is not known whether (26) with n = 3 has only finitely many solutions. Nesterenko and
Shorey [53] proved that (26) with $m \equiv 1 \pmod 2$, m < 23 and n = 3 has no solution other
than the ones given by (27). Let n = 3 and x = 2 in (26). Then y > 2 and we re-write (26) as

$$(2y + 1)^2 + 7 = 2^{m+2}.$$

Now we apply the result of Nagell stated above in this section to conclude that (26) with
x = 2 and n = 3 implies (27). Thus the numbers 31 and 8191 are given by the Ramanujan-Nagell
equation (25). Perhaps (26) has no solution other than the ones given by (27). If x
and y are fixed, we read the exponents mod 3 to write (28) as Thue equations with fixed
coefficients and hence (26) has only finitely many solutions. Balasubramanian and Shorey [5]
extended the preceding result by showing that (26) has only finitely many solutions whenever
x and y are composed of primes from a given finite set. Further Shorey [83] showed that
this is also the case if gcd(x, y) = 1 and either x, y − x or y, y − x are composed of primes
from a given finite set. For a given m and n, Davenport, Lewis and Schinzel [20] proved
that (26) has only finitely many solutions. The proof depends on the well-known theorem of
Siegel [104] on integral solutions of polynomial equations in two variables and therefore the
proof of Davenport, Lewis and Schinzel does not allow an explicit bound for the magnitude
of the solutions. Further for a given m and n with gcd(m − 1, n − 1) > 1, Davenport, Lewis
and Schinzel [20] gave an effective version of their result. Now the proof depends on an
elementary method of Runge [64]. Further Nesterenko and Shorey [53] showed that (26) with
h = gcd(m − 1, n − 1) > 1 implies that max(x, y, m, n) is bounded by a number depending
only on (m − 1)/h. This includes the effective version of Davenport, Lewis and Schinzel
stated above. The proof depends on an extension of Runge's method to exponential diophantine
equations sketched in the last section. We counted the power of 2 on both the sides of
(21) to show that k is small as compared with x. The analogous assertion for (26) should be that
m is small as compared with y. This is achieved by the theory of linear forms in logarithms.
For a given pair $(r, s) \ne (1, 1)$ of relatively prime positive integers, let $S_{r,s}$ be the set of all
pairs (m, n) = (1 + dr, 1 + ds) with d = 2, 3, .... This is an infinite set and the above result
states that (26) has only finitely many solutions when the exponents m and n are restricted to
the elements of $S_{r,s}$. This is the first result on (26) of the type where the exponents m and n
are unbounded and there is no restriction on variables x and y. We observe that (26) asks for
integers with all the digits equal to one with respect to two distinct bases. Further we notice
that 31 and 8191 are prime numbers such that $\omega(N - 1) = 3$ if N = 31 and $\omega(N - 1) = 5$
if N = 8191. Shorey [90] showed that 31 and 8191 are the only prime numbers N such
that $\omega(N - 1) \le 5$ and N has all the digits equal to one with respect to two distinct bases.
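
The two known solutions (27) and their link with the Ramanujan-Nagell equation (25) can be checked numerically. The Python sketch below lists integers up to a bound that have at least three digits, all equal to one, in two distinct bases, and separately recovers 31 and 8191 from the solutions of (25) via X = 2y + 1, N = m + 2. Function names and search bounds are illustrative assumptions; a finite search says nothing about solutions beyond the bound.

```python
from collections import defaultdict
from math import isqrt

def repunit(base, length):
    """Integer whose digits in the given base are `length` ones."""
    return (base**length - 1) // (base - 1)

def goormaghtigh_collisions(limit):
    """Integers <= limit that are repunits of length >= 3 in two distinct bases."""
    reps = defaultdict(list)
    base = 2
    while repunit(base, 3) <= limit:
        length = 3
        while (value := repunit(base, length)) <= limit:
            reps[value].append((base, length))
            length += 1
        base += 1
    return {N: pairs for N, pairs in reps.items() if len(pairs) > 1}

def ramanujan_nagell(max_n):
    """Solutions (x, n) of x^2 + 7 = 2^n with 3 <= n <= max_n."""
    found = []
    for n in range(3, max_n + 1):
        x = isqrt(2**n - 7)
        if x * x == 2**n - 7:
            found.append((x, n))
    return found

def from_nagell(max_n):
    """Pairs (y, m) with y > 2 solving (26) for x = 2, n = 3, read off from (25)."""
    return [((X - 1) // 2, N - 2)
            for X, N in ramanujan_nagell(max_n) if (X - 1) // 2 > 2]
```

Here `from_nagell` uses the substitution from the text: a solution (X, N) of (25) with X = 2y + 1 and N = m + 2 gives $2^m - 1 = y^2 + y + 1$.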

For relatively prime positive integers A and B, a more general equation than (26), namely,

$$A\,\frac{x^m - 1}{x - 1} = B\,\frac{y^n - 1}{y - 1} \tag{29}$$

in integers x > 1, y > 1, m > 2, n > 2 with $B(x - 1) \ne A(y - 1)$ has been considered
in [5], [83], and [86]. It has been shown in [5] that (29) implies that max(x, y, m, n) is
bounded by a number depending only on A, B and the greatest prime factor of xy. Let
A, B, X and Y be positive integers satisfying $1 \le A < X$, $1 \le B < Y$ and
$B(X - 1) \ne A(Y - 1)$. Then it has been proved in [86] that the number of integers whose digits are all
equal to A with respect to base X and equal to B with respect to base Y does not exceed
24. That the estimate is independent of A, B, X and Y is an interesting feature of the 
preceding result. 


4 An Equation of Nagell-Ljunggren 


We consider the equation

$$\frac{x^m - 1}{x - 1} = y^q \quad \text{in integers } x > 1,\ y > 1,\ m > 2,\ q \ge 2. \tag{30}$$
This equation asks for perfect powers whose digits are all equal to one with respect to base x.
By writing $y^q = (y^{q/p})^p$, there is no loss of generality in assuming that q is a prime number.
We observe that (x, y, m, q) = (3, 11, 5, 2), (7, 20, 4, 2) and (18, 7, 3, 3) are solutions of
(30). It has been conjectured that (30) has no other solution. Ljunggren [39] confirmed
it whenever q = 2. Therefore we always suppose that q > 2 in (30). Further it follows
from the results of Ljunggren [39] and Nagell [50] that (30) implies that $3 \nmid m$, $4 \nmid m$ and if
q = 3 then $m \equiv 5 \pmod 6$ unless (x, y, m, q) = (18, 7, 3, 3). The proofs of these results
are also available in Ribenboim [61]. 
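
The three solutions of (30) listed above are small enough to be found by direct search. The following Python sketch enumerates repunits $(x^m - 1)/(x - 1)$ with m > 2 up to a bound and tests each for being a perfect power, using an exact integer root to avoid floating point error. The names and the bound are illustrative; the search is a sanity check within its range only.

```python
def repunit(base, length):
    """Integer whose digits in the given base are `length` ones."""
    return (base**length - 1) // (base - 1)

def iroot(N, q):
    """Largest integer y with y**q <= N, found by bisection."""
    lo, hi = 1, 1 << (N.bit_length() // q + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid**q <= N:
            lo = mid
        else:
            hi = mid - 1
    return lo

def nagell_ljunggren_solutions(limit):
    """All (x, y, m, q) with x > 1, y > 1, m > 2, q >= 2 and repunit(x, m) = y^q <= limit."""
    found = []
    x = 2
    while repunit(x, 3) <= limit:
        m = 3
        while (N := repunit(x, m)) <= limit:
            # y >= 2 forces q <= log2(N) < N.bit_length()
            for q in range(2, N.bit_length()):
                y = iroot(N, q)
                if y > 1 and y**q == N:
                    found.append((x, y, m, q))
            m += 1
        x += 1
    return found
```

With a bound of $10^6$ the search returns exactly the three solutions quoted in the text, in order of increasing x.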

We show that the abc conjecture implies that (30) has only finitely many solutions. We
re-write (30) as
$$x^m = (x - 1)y^q + 1.$$

Now we apply the abc conjecture with $\epsilon = 1/8$ to the preceding equation for deriving that

$$x^{m-2-2\epsilon} \le \kappa_2\, y^{1+\epsilon} \tag{31}$$
where $\kappa_2$ is an absolute constant. On the other hand, we see from (30) that $y^q < 2x^{m-1}$
which implies that

$$x^{m-2-2\epsilon} \ge \frac{1}{2}\, y^{q(m-2-2\epsilon)/(m-1)}.$$
By the results of Ljunggren and Nagell stated above, we may suppose that $m \ge 5$. Therefore,
since $\epsilon = 1/8$ and $q \ge 3$, we see that the exponent of y in the preceding inequality is at
least 3/2. This contradicts (31) since y is sufficiently large.

From now onward in this section, we confine ourselves to a weaker version of the conjecture,
namely that (30) has only finitely many solutions. Shorey and Tijdeman [94] confirmed this conjecture
when x is fixed. In particular there are only finitely many perfect powers with all the digits
equal to one in their decimal expansions. This settles an old problem and the proof depends
on the theory of linear forms in logarithms. Now we apply the above result of Shorey and
Tijdeman to show that it suffices to prove the conjecture when m is a prime power. Let
$m = P^a m_1$ where a > 0 is an integer, P = P(m), $\gcd(m_1, P) = 1$ and we re-write (30) as

$$\frac{x^m - 1}{x^{P^a} - 1} \cdot \frac{x^{P^a} - 1}{x - 1} = y^q.$$

Let p be a prime dividing each of the factors on the left hand side. Then p divides $m_1$. Let
$\nu$ be the least positive integer such that $x^\nu \equiv 1 \pmod p$. Then $\nu$ divides $P^a$ as well as
p − 1. Thus either $\nu = 1$ or $P \le \nu < p \le P$ since P = P(m). Consequently $\nu = 1$ which
implies that p = P divides $m_1$, contradicting $\gcd(m_1, P) = 1$. Therefore we conclude
that the factors on the left hand side are relatively prime. Thus the second factor is a q-th
power. Hence the assertion follows from the result of Shorey and Tijdeman stated above.
The above factorisation of equation (30) appears for the first time in [84, Lemma 7] and
it has been very useful. Shorey [84], [85] confirmed the conjecture when $\omega(m) > q - 2$.
For the proof of this result, Shorey [84], [85] showed that (30) has only finitely many
solutions when either $m \equiv 1 \pmod q$ or x is a q-th power. The proofs depend on the theory
of linear forms in logarithms and a result of Baker [1] on the approximations of certain
algebraic numbers by rationals proved by the Thue-Siegel hypergeometric method. The latter
result already found application in Shorey and Tijdeman [94] that (30) with x = 10 and
q < 19 has no solution. Le [38] proved that (30) has no solution whenever x is a q-th power.
Further Le (Acta Arith. 69 (1995), 91-97) claimed that (30) with $m \equiv 1 \pmod q$ has no
solution but his proof is not correct. A correct proof has been given by Bennett [9]. This is
an immediate consequence of his result [9] that for integers a > 0 and $n \ge 3$, the equation

$$(a + 1)x^n - a y^n = 1$$

has no solution in integers x > 0 and y > 0 other than x = y = 1. A weaker version
of the preceding result was given by Bennett and de Weger [10]. We replace the above
stated results of Shorey that (30) has only finitely many solutions if x is a q-th power or
$m \equiv 1 \pmod q$ by the results of Le and Bennett that (30) has no solution if either x is a
q-th power or $m \equiv 1 \pmod q$ in the proof of Shorey that (30) with $\omega(m) > q - 2$ has only
finitely many solutions. Then we conclude that (30) with $\omega(m) > q - 2$ has no solution.
Now we show that (30) with $q \mid m$ implies that m is a power of q. We write $m = q^e m'$
where $e \ge 1$ and gcd(m', q) = 1. Further we may suppose that m' is not divisible by a
prime $\equiv 1 \pmod q$, otherwise the assertion follows as in [84] from the above factorisation
of (30) and the result of Bennett. Now we re-write (30) as


$$\frac{x^m - 1}{x^{q^e} - 1} \cdot \frac{x^{q^e} - 1}{x - 1} = y^q.$$

Let p be a prime dividing each of the factors on the left hand side. Then p divides m'. Let
$\nu$ be the least positive integer such that $x^\nu \equiv 1 \pmod p$. Then $\nu$ divides $q^e$ and p − 1.
Therefore $\nu = 1$ since $p \not\equiv 1 \pmod q$. Now we see that p = q divides m' contradicting
gcd(m', q) = 1. Hence we derive that the factors on the left hand side are relatively prime.
Therefore the first factor on the left hand side is a q-th power. Then we may suppose that
m' = 1 or m' = 2 since $x^{q^e} = (x^{q^{e-1}})^q$ is a q-th power. The latter possibility is ruled out
since two consecutive positive integers cannot be q-th powers and the assertion is proved.
On the other hand, a very particular case of the conjecture that (30) has only finitely many
solutions when m is a power of q is an open problem. An easier question than the conjecture
that (30) has only finitely many solutions is to replace $\omega(m) > q - 2$ by $\omega(m) \ge 2$ in the
result of Shorey stated above. This is possible if m is divisible by q as derived above.
Now we consider (30) with m even. Then m = 2n where n > 1 is odd by Nagell [50]
and
$$\frac{x^n - 1}{x - 1} = y_1^q, \quad x^n + 1 = y_2^q$$
where $y_1$ and $y_2$ are relatively prime positive integers greater than one. The second is
the equation of Catalan. Catalan [17] conjectured that the only solution of this equation
is given by x = 2, n = 3, $y_2 = 3$, q = 2. In other words, the conjecture of Catalan
states that 9 and 8 are the only perfect powers that differ by one. This remains open but
Tijdeman [108] proved that the equation of Catalan has only finitely many solutions in
integers x > 1, $y_2 > 1$, n > 1, q > 1. Therefore we derive that (30) with m even
has only finitely many solutions. More generally, Shorey and Tijdeman [94] showed that 
(30) has only finitely many solutions if m is divisible by a fixed prime. It has not yet been 
possible to show that (30) has no solution whenever m is even. Mignotte and Roy [46] 
proved that $x^n + 1 = y^q$ implies that


max (n,q) > 10°, min (n,g)> 10°. 


For positive integers a, b and non-zero integer k, the equation of Catalan is a particular case
of the following equation of Pillai:

$$a x^m - b y^n = k \quad \text{in integers } x > 1,\ y > 1,\ m > 1,\ n > 1 \text{ with } mn \ge 6.$$

Pillai [56] conjectured that this equation has only finitely many solutions. This conjecture
has been confirmed if at least one of the four variables in the above equation is fixed, see
Shorey and Tijdeman [95, Chapter 12], to which we also refer for a survey on exponential
diophantine equations. This equation is known as Pillai's equation.
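
The statement that 8 and 9 are the only perfect powers differing by one can be tested in any finite range. The sketch below lists all perfect powers up to a bound and reports consecutive entries with difference one; the function names and the bound are illustrative, and the search only covers the range it scans.

```python
def perfect_powers(limit):
    """Sorted list of all integers b**e <= limit with b >= 2 and e >= 2."""
    powers = set()
    base = 2
    while base * base <= limit:
        value = base * base
        while value <= limit:
            powers.add(value)
            value *= base
        base += 1
    return sorted(powers)

def consecutive_perfect_powers(limit):
    """Pairs (a, a + 1) of perfect powers with a + 1 <= limit."""
    pw = perfect_powers(limit)
    return [(a, b) for a, b in zip(pw, pw[1:]) if b - a == 1]
```

The same list can be used to count, for a fixed non-zero k, the pairs of perfect powers with difference k below the bound, which is the special case a = b = 1 of Pillai's equation.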
The height of a rational number, in its reduced form, is the maximum of the absolute
values of its numerator and denominator. Let f(X) be a polynomial of degree n with
rational coefficients such that it has at least two distinct roots and $f(0) \ne 0$. We write

$$f(X) = a_0 X^n + a_1 X^{n-1} + \cdots + a_n$$

where $a_0, a_1, \ldots, a_n$ are rational numbers with $a_0 \ne 0$ and $a_n \ne 0$. Let H be the maximum
of the heights of $a_i$ with $0 \le i \le n$ and L be the number of i with $0 \le i \le n$ such that
$a_i \ne 0$. Let m > 2, x and y with |y| > 1 be integers satisfying

$$y^m = f(x)$$

which we suppose without reference in this paragraph. By a proper subsum of $y^m - f(x)$,
we understand a proper subsum of $y^m - a_0 x^n - a_1 x^{n-1} - \cdots - a_n$ where the terms $a_i x^{n-i}$
with $a_i = 0$ are ignored. Schinzel and Tijdeman [79] proved that m is bounded by a number
depending only on f. We conjecture: There exists a number $C_5$ depending only on L and
H such that either

$$m \le C_5$$

or $y^m - f(x)$ has a proper subsum which vanishes. We refer to the above conjecture as
Conjecture 1. For the proof of the conjecture, we may suppose that m exceeds a sufficiently
large number depending only on L, H and $y^m - f(x)$ has no proper subsum which vanishes.
Then we apply the generalised abc conjecture (see Darmon and Granville [18, p. 533]) to derive
that n is bounded by a number depending only on L and H. Now we see from the theorem
of Schinzel and Tijdeman stated above that m is bounded by a number depending only on L
and H. Thus Conjecture 1 follows from the generalised abc conjecture. Further Conjecture 1
implies that the exponents in Pillai's equation are bounded and then the conjecture of Pillai
follows from the result of Baker [3] on integral solutions of hyper-elliptic equations. Next,
we see that Conjecture 1 implies that either $m \le C_5$ or $|x| \le 2H^2$. Therefore Conjecture 1
includes the theorem of Schinzel and Tijdeman with $f(0) \ne 0$ stated above. For positive
integers m, x, $\mu$, $\nu$ with m > 1, x > 1, $\mu > \nu$ and $\Delta = (\mu^m - \nu^m)^2$, we consider
Conjecture 1 in the following cases:

$$f(X) = (X^2 - \Delta)/4, \quad x = \mu^m + \nu^m, \quad y = \mu\nu$$

and
$$f(X) = (x - 1)(X^{m-1} + \cdots + X + 1) + 1, \quad y = x.$$

We observe that no proper subsum of $y^m - f(x)$ vanishes in either of the cases. Since
$f(x) = (\mu\nu)^m$ in the first case and $f(x) = x^m$ in the second case, we see that the
dependence of $C_5$ on L as well as H is necessary in Conjecture 1. Further it is clear that
the assumption that f has at least two distinct roots such that $f(0) \ne 0$ is also necessary in
Conjecture 1.

Let us consider the equation

$$\frac{x^m - 1}{x - 1} = y^q + 1 \quad \text{in integers } x > 1,\ y > 1,\ m > 2,\ q \ge 2. \tag{32}$$
By subtracting one on both the sides of (32), we have

$$x \cdot \frac{x^{m-1} - 1}{x - 1} = y^q.$$

We see that $m \ge 4$ since a product of two consecutive integers is not a power. Further we
observe that the factors on the left hand side are relatively prime. Therefore

$$x = y_3^q, \quad \frac{x^{m-1} - 1}{x - 1} = y_4^q$$

where $y_3 > 1$ and $y_4 > 1$ are integers. Then Shorey [89] applied his result stated above
to conclude that (32) has only finitely many solutions. Further Le [38] concluded that
(32) has no solution. This is an immediate consequence of his result that (30) has no
solution whenever x is a q-th power. The first result in the direction of the preceding
result is due to Inkeri [33] that (30) with q = 3 has no solution whenever x is a cube.
Saradha and Shorey [74] proved that (30) with $x = z^2$ implies that $z \le 31$ and
$z \notin \{2, 3, 4, 8, 9, 16, 25, 27\}$. The proof depends on an estimate of Laurent, Mignotte and
Nesterenko [36] on linear forms in two logarithms, irrationality measures of Baker [1] of
certain algebraic numbers, p-adic method of Skolem as given by Le [37] and congruence
arguments due to Inkeri [33]. Mignotte [45] has recently improved the above referred
estimate on linear forms in two logarithms. By combining this estimate with estimates
on solutions of certain Thue-Mahler equations and difficult and elaborate computations,
Bugeaud, Mignotte, Roy and Shorey [15] showed that (30) with $x = z^2$ and $z \le 31$,
$z \notin \{2, 3, 4, 8, 9, 16, 25, 27\}$ has no solution. The preceding result was proved, independently
and differently, by Bennett [9]. Hence (30) has no solution whenever x is a square. The
proof of Bennett depends on his result on the equation $(a + 1)x^n - a y^n = 1$ stated above.
Further Hirata-Kohno and Shorey [54] proved that for a prime number $\mu \ge 3$, equation (30)
with $x = z^\mu$ and $q > 2(\mu - 1)(2\mu - 3)$ implies that max(x, y, m, q) is bounded by a
number depending only on $\mu$. If $\mu = 3$, we may suppose that $q \ne 3$ in view of a result of
Inkeri stated above. Therefore (30) with $x = z^3$ and $q \ne 5, 7, 11$ has only finitely many
solutions. Linear forms in logarithms with $\alpha_i$'s very close to one appear in the proof. If $\alpha_i$'s
are very close to one Shorey [80], by integrating the auxiliary function on a large circle,
obtained for the first time sharp estimates for linear forms in logarithms and these are close
to the best possible in the sense that $\log A_1 \cdots \log A_n$ can be replaced by log A where
$A = \max_{1 \le j \le n} A_j$. The proof of Hirata-Kohno and Shorey shows that sharp estimates for
these special linear forms in logarithms combine well with the irrationality measures of
Baker [1], [2] proved by hypergeometric method. This feature appeared for the first time in
Shorey [84]. See also Sections 7, 8 below and Mignotte [44], Bennett and de Weger [10]
where linear forms in logarithms with $\alpha_i$'s close to one find applications.

We recall that (30) has only finitely many solutions if x is fixed. A natural p-adic
extension of the above result is not known. In other words, we do not know whether (30)
has only finitely many solutions if x is composed of primes from a given finite set. Even
a proof of the weaker assertion that (30) has only finitely many solutions if x is a power of
an arbitrary fixed integer is not available. Saradha and Shorey [74] gave an infinite set $S_4$
containing all integers in the interval (1, 20] other than 11 such that for $h \in S_4$ and integer
t > 0, equation (30) with $x = h^t$ implies that max(x, y, m, q) is bounded by an absolute
constant. We refer to [74] and [12] for an explicit construction of the set $S_4$. Now we give an
application of this result to show that certain numbers considered by Mahler are irrational.
Let $g \ge 2$ and $h \ge 2$ be integers. For any integer $n \ge 1$, we write $n = a_1 h^{r-1} + \cdots + a_r$
for some integers r > 0 and $0 \le a_i < h$ for $1 \le i \le r$ with $a_1 \ne 0$. We define
$(n)_h = a_1 \ldots a_r$ i.e. the sequence of digits of n written in h-ary notation. For a sequence
$\{n_i\}_{i=1}^{\infty}$ of non-negative integers, we put

$$a_h(g) = 0.(g^{n_1})_h (g^{n_2})_h \cdots.$$

Mahler [41] proved that $a_{10}(g)$ is irrational for $\{n_i\}_{i=1}^{\infty} = \{i - 1\}_{i=1}^{\infty}$. It is now known
that $a_h(g)$ is irrational for any unbounded sequence $\{n_i\}_{i=1}^{\infty}$ of non-negative integers, see
Sander [65]. If an element occurs in a sequence infinitely many times, it is called a limit point
of the sequence. Let $\{n_i\}_{i=1}^{\infty}$ be a bounded sequence of non-negative integers. If it has only
one limit point, it is ultimately periodic and hence $a_h(g)$ is rational. We always suppose from now
onward that it has exactly two limit points $N_1 < N_2$ such that $g^{N_2 - N_1} \ne h + 1$ whenever
$g^{N_1} < h$ and it is not ultimately periodic. Then the above result of Saradha and Shorey on
(30) implies that for integers $g \ge 2$ and $h \in S_4$, if $a_h(g)$ is rational then $N_2$ is bounded
by an absolute constant. In fact Sander [65] claimed a more general irrationality result but
it depends on an unproved assertion on (30) stated in the beginning of this paragraph. The
connection between (30) and irrationality of $a_h(g)$ is due to Sander [65]. More precisely,
Sander [65] showed that if $a_h(g)$ is rational, then

$$g^{N_2 - N_1} = \frac{h^{tL} - 1}{h^t - 1}$$


for some integer $L \ge 2$ and t given by $h^{t-1} \le g^{N_1} < h^t$. Bugeaud, Mignotte and Roy [14]
improved the assertion of Saradha and Shorey by showing that $a_h(g)$ with $g \ge 2$ and
$h \in S_4$ is irrational unless $g = 1 + h + \cdots + h^{L-1}$ for every $L \ge 2$ if $(N_1, N_2) = (0, 1)$ or
$(N_1, N_2, g, h) = (0, 2, 11, 3), (0, 2, 20, 7), (0, 3, 7, 18), (1, 4, 7, 18)$. On the other hand,
it is easy to see that $a_h(g)$ is rational in any of the above possibilities. A more general
statement of irrationality of $a_h(g)$ is available if g is even and h is odd. In this case, we
observe that L appearing above in the result of Sander has to be even. Thus we are led
to (30) with m even and it has only finitely many solutions. Hence we conclude that, for
integers $g \ge 2$ and $h \ge 2$ such that g is even and h is odd, if $a_h(g)$ is rational then $N_2$
is bounded by an absolute constant. For bounded sequences with more than two limit
points, Sander [65] obtained some partial results and further investigations are desirable.
For proving the above result on (30) applied to derive an irrationality criterion, Saradha and
Shorey [74] showed that (30) implies that either max(x, y, m, q) is bounded by an absolute
constant or there exists a prime p such that $p \mid x$ and $p \nmid (y - 1)$. M. Le (Acta Arith. 69
(1995), 91-97) and L. Yu and M. Le (Acta Arith. 73 (1995), 363-365) claimed a stronger
version of the preceding result but their proofs are not correct. Correct proofs have recently
been given by Bugeaud, Mignotte and Roy [14] who proved that (30) has no solution other
than (x, y, m, q) = (18, 7, 3, 3) if every prime divisor of x divides y − 1. This implies their
result on irrationality of $a_h(g)$ stated above. This also includes a result of Bugeaud and
Mignotte [13] that (30) with $2 \le x \le 10$ has no solution settling a problem of Inkeri [33].




Thus it is not possible to find a perfect power greater than one with digits identically equal
to one in its decimal expansion. Further the result of Bugeaud and Mignotte that (30) with
$x = 3$ has no solution finds application in group theory. We remark that $p$-adic linear forms
in logarithms with $\alpha_i$'s $p$-adically close to one appear in the work of Bugeaud, Mignotte and
Roy, and sharp estimates of Bugeaud [12] for these $p$-adic linear forms in logarithms are
utilised. The work also involves very heavy computations depending on a result of Bennett
that (30) with $m \equiv 1 \pmod{q}$ has no solution.
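
The four exceptional tuples $(N_1, N_2, g, h)$ quoted above can be checked against Sander's relation by direct arithmetic. The following is only a sketch: it assumes the relation reads $g^{N_2 - N_1} = (h^{Lt} - 1)/(h^t - 1)$ with $t$ the number of base-$h$ digits of $g^{N_1}$, and the helper name `sander_exponent` is ours, not the paper's.

```python
def sander_exponent(N1, N2, g, h):
    """Return (t, L) with g**(N2-N1) == (h**(t*L) - 1) // (h**t - 1), else None."""
    # t is the number of base-h digits of g**N1: h**(t-1) <= g**N1 < h**t.
    t = 1
    while h ** t <= g ** N1:
        t += 1
    target = g ** (N2 - N1)
    # repunit = 1 + h^t + h^(2t) + ... + h^(t(L-1)) = (h^(tL) - 1) / (h^t - 1)
    L, repunit = 1, 1
    while repunit < target:
        repunit += h ** (t * L)
        L += 1
    return (t, L) if repunit == target else None


cases = [(0, 2, 11, 3), (0, 2, 20, 7), (0, 3, 7, 18), (1, 4, 7, 18)]
results = {case: sander_exponent(*case) for case in cases}
for case, value in results.items():
    print(case, "->", value)
```

For instance, the first tuple corresponds to $11^2 = 121 = 1 + 3 + 9 + 27 + 81$, and the last two both rest on $7^3 = 343 = 1 + 18 + 18^2$, which is the solution $(x, y, m, q) = (18, 7, 3, 3)$ of (30).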

Now we consider a more general equation than (30). For positive integers $A$, $B$ and
prime number $q$ with $AB > 1$, $\gcd(A, B) = 1$ and $q$-free $A$, we consider the equation

\[ A\,\frac{x^m - 1}{x - 1} = B y^q \quad \text{in integers } x > 1,\ y > 1,\ m > 2,\ q \ge 2. \tag{33} \]


Oblath [55] showed that (33) with $1 < A < 10$, $B = 1$ and $x = 10$ has no solution.
Inkeri [33] determined all solutions of (33) with 1 < A < x < 10 and B = 1. Shorey and 
Tijdeman (see [94], [93]) showed that (33) implies that either max (x, y, m, q) is bounded 
by a number depending only on A and B or m = 2n with 


\[ \frac{x^n - 1}{x - 1} = y_5^q, \qquad A(x^n + 1) = B y_6^q \]

where $y_5 > 1$ and $y_6 > 1$ are integers satisfying $\gcd(y_5, y_6) = 1$ and $y_5 y_6 = y$. Now it is
clear from the above relations that it is easier to deal with (33) than (30). Therefore stronger
results on (33) than (30) are available though we have not been able to show that (33) has
only finitely many solutions. By combining the preceding result with the known results in
the theory of exponential diophantine equations, Shorey [93] observed that (33) implies that
$\max(x, y, m, q)$ is bounded by a number depending only on $A$, $B$ and the greatest prime
factor of $x$. As already mentioned, the preceding assertion remains unproved for (30). For
integer $x > 1$ and prime number $q$, we write $U(A, B, x, q)$ and $U(x, q)$ for the number
of solutions of (33) and (30) in integers $y > 1$ and $m > 2$, respectively. Shorey [85]
showed that


\[ U(x, q) < q + C_6 \]
where $C_6$ is an absolute constant. Le [38] sharpened it to


Shorey [93] showed that 
\[ U(A, B, x, q) < C_7 \]

where $C_7$ is a number depending only on $A$ and $B$. We refer to Shorey [93] for stronger
results on equation (33) when the exponents $m$ run through an infinite set satisfying

\[ \gcd(m, AB\varphi(AB)) = 1 \tag{34} \]

where $\varphi$ is the Euler totient function. For example, (33) with x = z”, (34) and (33) with
$q = 2$, (34) have no solution.
5 Equal Products of Integers in Arithmetic Progression


In this section we shall continue applications of the extension of Runge's method to exponential
diophantine equations sketched in Section 2. Further these applications result in developing
the method. For positive integers $m$, $d_1$ and $d_2$, we consider

\[ x(x + d_1) \cdots (x + (k - 1)d_1) = y(y + d_2) \cdots (y + (mk - 1)d_2) \quad \text{in integers } x > 0,\ y > 0,\ k \ge 2. \tag{35} \]


Equation (35) with $d_1 = d_2 = 1$ coincides with (18). Saradha and Shorey [72] showed
that (35) with $m > 2$ and $d_1 = d_2 = d$ implies that $\max(x, y, k)$ is bounded by a number
depending only on $m$, $d$ and this includes (20) corresponding to the case $d_1 = d_2 = 1$.
If $m = 2$, Saradha and Shorey [73] proved that (35) implies that either $\max(x, y, k)$ is
bounded by a number depending only on $d_1$, $d_2$ or $k = 2$, $d_1 = 2d_2^2$, $x = y^2 + 3d_2 y$.
On the other hand, equation (35) with $m = 2$ is satisfied whenever the latter possibility
holds. Saradha, Shorey and Tijdeman [77] proved that (35) with $m = 2$, $d_1 = 1$ and
$d_2 < k + 1$ implies that $k < 35$ and $d_2 = 2^{\alpha}$ for some integer $\alpha \ge 2$ and consequently
they showed that $32 \cdot 33 = 1 \cdot 6 \cdot 11 \cdot 16$, $207 \cdot 208 = 8 \cdot 13 \cdot 18 \cdot 23$ are the only solutions of (35)
with $m = 2$, $d_1 = 1$ and $d_2 = 5, 6$. Saradha and Shorey [73] proved that (35) with $m > 2$
implies that $k$ is bounded by a number depending only on $m$, $d_1$ and $d_2$. For a fixed $k$,
they [73] showed that (35) implies that either $\max(x, y)$ is bounded by a number depending
only on $m$, $d_1$, $d_2$ or $d_1/d_2^m$ is a product of $m$ distinct positive integers composed of primes
not exceeding $m$. It is clear that the latter possibility cannot hold whenever $d_1 = d_2$.
Further it is replaced in [73] by $m > a(k)$ where


14 for2<k</7 
a(k) = | fork =8 
exp (k log k — (1.25475)k — log k + 1.56577) for k > 9. 


We observe that $a(k) > 2568$ for $k \ge 9$. Finally Saradha, Shorey and Tijdeman [75] proved
that (35) with $m > 2$ implies that $\max(x, y)$ is bounded by a number depending only on
$d_1$, $d_2$, $k$ and $m$. In fact, for positive integers $d_1$, $d_2$, $\ell$ and $m$ with $\ell < m$, $m > 2$ and
$\gcd(\ell, m) = 1$, Saradha, Shorey and Tijdeman [75] showed that if $x > 0$, $y > 0$ and $k \ge 2$ are
integers satisfying

\[ x(x + d_1) \cdots (x + (\ell k - 1)d_1) = y(y + d_2) \cdots (y + (mk - 1)d_2) \tag{36} \]

then $\max(x, y, k)$ is bounded by a number depending only on $d_1$, $d_2$ and $m$. Let $f(X)$ be
a monic polynomial of degree $\nu > 0$ with rational coefficients. Then we consider

\[ f(x) f(x + d_1) \cdots f(x + (\ell k - 1)d_1) = f(y) f(y + d_2) \cdots f(y + (mk - 1)d_2) \tag{37} \]

in integers $x$, $y$ and $k \ge 2$ such that $f(x + jd_1) \ne 0$ for $0 \le j \le \ell k - 1$. We observe
that equation (37) with $f(X) = X$ is (36). Saradha, Shorey and Tijdeman [78] proved that
(37) implies that $k$ is bounded by a number depending only on $d_1$, $d_2$, $m$ and $f$. Further,
if $f$ is a power of an irreducible polynomial, they [78] proved that (37) implies that
$\max(|x|, |y|, k)$ is bounded by a number depending only on $d_1$, $d_2$, $m$ and $f$ unless $\ell = 1$,
$m = k = 2$, $d_1 = 2d_2^2$ and $f(X) = (X + r)^{\nu}$ with integer $r$, $x + r = (y + r)(y + r + 3d_2)$.
An extension of this result to all non-constant monic polynomials with rational coefficients
remains unproved. Further it is clear from the proof that the preceding result extends to all
non-constant monic polynomials with rational coefficients if we show that the underlying
curve for (37) is irreducible. A proof for irreducibility in the particular case $f(X) = X$ and
$(\ell, m, k) \ne (1, 2, 2)$ was given by Beukers, Shorey and Tijdeman [11]. The particular case
$d_1 = d_2 = \ell = 1$ of the result of Saradha, Shorey and Tijdeman on (37) was already proved
by Balasubramanian and Shorey [6]. Saradha [66] considered (37) when $f$ is reducible.
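
The exceptional family for (35) with $m = 2$ rests on the identity $y(y + d_2)(y + 2d_2)(y + 3d_2) = (y^2 + 3d_2 y)(y^2 + 3d_2 y + 2d_2^2)$. As a small numerical sketch (the helper names `lhs35` and `rhs35` are illustrative only), one can confirm the family $k = 2$, $d_1 = 2d_2^2$, $x = y^2 + 3d_2 y$ and the two sporadic solutions with $d_2 = 5$ quoted above:

```python
from math import prod


def lhs35(x, d1, k):
    """Left side of (35): x (x + d1) ... (x + (k-1) d1)."""
    return prod(x + j * d1 for j in range(k))


def rhs35(y, d2, m, k):
    """Right side of (35): y (y + d2) ... (y + (mk-1) d2)."""
    return prod(y + j * d2 for j in range(m * k))


def in_family(y, d2):
    # Exceptional family for m = 2: k = 2, d1 = 2 d2^2, x = y^2 + 3 d2 y.
    d1 = 2 * d2 * d2
    x = y * y + 3 * d2 * y
    return lhs35(x, d1, 2) == rhs35(y, d2, 2, 2)


family_ok = all(in_family(y, d2) for y in range(1, 50) for d2 in range(1, 20))

# The two solutions with m = 2, d1 = 1, d2 = 5 quoted from [77].
sporadic_ok = [
    lhs35(32, 1, 2) == rhs35(1, 5, 2, 2),    # 32.33 = 1.6.11.16
    lhs35(207, 1, 2) == rhs35(8, 5, 2, 2),   # 207.208 = 8.13.18.23
]
print(family_ok, sporadic_ok)
```

The check is of course no substitute for the finiteness results; it only confirms that the stated solutions do satisfy (35).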
Let $s_1, \ldots, s_\mu$ with $\mu \ge 2$ be distinct positive integers. Suppose that $x$, $y$ and $k \ge 1$ are
integers satisfying

\[ \prod_{i=1}^{\mu} \prod_{j=0}^{\ell k - 1} (x - s_i + jd_1) = \prod_{i=1}^{\mu} \prod_{j=0}^{mk - 1} (y - s_i + jd_2) \]

such that the factors on the left hand side as well as the factors on the right hand side are
non-zero and pairwise distinct. Then Saradha [66] proved that $\max(|x|, |y|, k)$ is bounded
by a number depending only on $d_1$, $d_2$, $\ell$, $m$, $\mu$ and $s_1, \ldots, s_\mu$ whenever

\[ m = 2 \quad \text{or} \quad \mu = 2, 3, 4 \quad \text{or} \quad d_2 = 1 \]


unless 
\[ \ell = 1,\ m = 2,\ \mu = 2,\ k = 1,\ d_2 = 1,\ x = y^2 + y(1 - s_1 - s_2) + s_1 s_2. \]


Further we observe that the above infinitely many possibilities satisfy the preceding equation.
Finally we turn to considering (35) with $m = 1$:

\[ x(x + d_1) \cdots (x + (k - 1)d_1) = y(y + d_2) \cdots (y + (k - 1)d_2). \tag{38} \]


This equation was suggested by Gabovich [29] in 1966. It is clear that (38) with $k = 2$
has infinitely many solutions. Further Gabovich [29] gave an infinite class of solutions
of (38) with $k = 3, 4$. Some infinite classes of solutions of (38) with $k = 5$ were given
by Szymiczek [106] and Choudhry [16]. Choudhry [16] also provided an infinite class of
solutions of (38) with arbitrary $k$ and unbounded $d_1$, $d_2$. Finally we consider equation (38)
with fixed $d_1$ and $d_2$. If $d_1 = d_2$, we observe that $x = y$. Therefore we suppose that $d_1$
and $d_2$ are distinct. Further there is no loss of generality in assuming that $d_1 < d_2$ and
$\gcd(x, y, d_1, d_2) = 1$. Then Saradha, Shorey and Tijdeman [76] proved that $\max(x, y, k)$
is bounded by a number depending only on $d_2$ unless

\[ x = k + 1,\ y = 2,\ d_1 = 1,\ d_2 = 4. \tag{39} \]

In view of the relations due to Makowski [42],

\[ (k + 1) \cdots (2k) = 2 \cdot 6 \cdots (4k - 2) \quad \text{for } k = 2, 3, \ldots, \]

the possibilities (39) cannot be excluded. The proof depends on the Prime Number Theorem
for arithmetic progressions. On the other hand, it is possible to make the proof independent
of the Prime Number Theorem for certain values of $d_1$ and $d_2$. This led Saradha,
Shorey and Tijdeman [77] to show that all the solutions of (38) with $d_1 = 1$ and $d_2 = 2$,
3, 5, 6, 7, 9, 10 are given by $2 \cdot 3 = 1 \cdot 6$, $7 \cdot 8 \cdot 9 = 4 \cdot 9 \cdot 14$, $8 \cdot 9 = 6 \cdot 12$, $5 \cdot 6 = 3 \cdot 10$,
$4 \cdot 5 \cdot 6 = 1 \cdot 8 \cdot 15$, $15 \cdot 16 \cdot 17 = 10 \cdot 17 \cdot 24$, $9 \cdot 10 = 6 \cdot 15$, $7 \cdot 8 = 4 \cdot 14$, $24 \cdot 25 = 20 \cdot 30$ and
$32 \cdot 33 \cdot 34 = 24 \cdot 34 \cdot 44$. This follows by computations from their result that (38) with $d_1 = 1$ and
$d_2 < k + 1$ implies that $y \equiv 2 \pmod{4}$ and $d_2 = 2^{\alpha}$ for some integer $\alpha \ge 2$.
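
Both the Makowski relations and the ten listed solutions of (38) are easy to confirm numerically. The sketch below encodes each listed solution as a tuple $(x, y, d_2, k)$ with $d_1 = 1$; the encoding and the helper `ap_product` are our own conventions, not taken from [77].

```python
from math import prod


def ap_product(start, diff, k):
    """Product of k terms in arithmetic progression: start, start+diff, ..."""
    return prod(start + j * diff for j in range(k))


# Makowski: (k+1)(k+2)...(2k) = 2.6.10...(4k-2), i.e. (39) solves (38).
makowski_ok = all(
    ap_product(k + 1, 1, k) == ap_product(2, 4, k) for k in range(2, 60)
)

# Listed solutions of (38) with d1 = 1, written as (x, y, d2, k).
listed = [
    (2, 1, 5, 2), (7, 4, 5, 3), (8, 6, 6, 2), (5, 3, 7, 2), (4, 1, 7, 3),
    (15, 10, 7, 3), (9, 6, 9, 2), (7, 4, 10, 2), (24, 20, 10, 2), (32, 24, 10, 3),
]
listed_ok = [ap_product(x, 1, k) == ap_product(y, d2, k) for x, y, d2, k in listed]
print(makowski_ok, all(listed_ok))
```

Note that $d_2 = 4$ and $d_2 = 8$ are absent from the list, in line with the condition $d_2 = 2^{\alpha}$ under which the method does not apply.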


6 The Greatest Prime Factor of Integers in Arithmetical Progression 
Let us consider equation (6) with $t = k$ and $P(y) \le k$. Then
\[ P := P(n(n + 1) \cdots (n + k - 1)) \le k. \]


This leads us to considering lower bounds for the greatest prime factor of the product of $k$
consecutive positive integers. The first result dates back to Sylvester [105] in 1892 that

\[ P > k \quad \text{if } n > k. \]

If $n \le k^{3/2}$ and $k$ is sufficiently large, it follows from well-known results on the difference
between consecutive primes that there is a prime $p$ satisfying $n \le p \le n + k - 1$. Therefore
we shall consider lower bounds for $P$ when $n > k^{3/2}$. Then Erdős [24] proved that

\[ P > C_8 k \log k \tag{40} \]

where $C_8 > 0$ is an absolute constant. We denote by $T$ the number of integers $n + i$ with
$0 \le i < k$ such that $P(n + i) \le k$. Then we apply a fundamental argument of Erdős for
deriving that

\[ (k^{3/2})^{T - \pi(k)} \le n^{T - \pi(k)} \le k^k \]

and the assertion (40) follows from Prime Number theory. Ramachandra [57] applied
Selberg's Sieve to prove (40) with $C_8 = 1 - \varepsilon$ for $\varepsilon > 0$ and $k$ exceeding a number
depending only on $\varepsilon$. Ramachandra [58] also obtained partial results for (40) with
$C_8 = 2 - \varepsilon$. Combining these results with the method of Roth and Halberstam on the difference
between consecutive $\nu$-free integers, Tijdeman [107] proved (40) with $C_8 = 2 - \varepsilon$. Further
Ramachandra and Shorey [59] sharpened (40) to
Ramachandra and Shorey [59] sharpened (40) to 


\[ P > C_9 k \log k \left( \frac{\log\log\log k}{\log\log\log\log k} \right)^{1/2} \quad \text{for } \log k > e^{e} \tag{41} \]


where $C_9 > 0$ is an absolute constant. The next improvement follows from the work of
Jutila [34] and Shorey [81], namely,

\[ P > C_{10} k \log k \, \frac{\log\log k}{\log\log\log k} \quad \text{for } \log k > e \tag{42} \]

where $C_{10} > 0$ is an absolute constant. The best possible estimates, with $\log A$ in place of
$\log A_1 \cdots \log A_n$, for linear forms in logarithms with $\alpha_i$'s close to one are crucial for the proofs
of (41) and (42). Furthermore the proof of (42) depends on estimates for exponential sums.
We shall continue with the results on lower bounds for $P$ at the end of the next section.
For an integer $d > 1$, we write

\[ P_d = P(n(n + d) \cdots (n + (k - 1)d)), \qquad \chi = n + (k - 1)d. \]

Improving on a result of Langevin [35], Shorey and Tijdeman [98] gave a stronger version
of Sylvester's theorem that

\[ P_d > k \quad \text{if } d > 1 \text{ and } (n, d, k) \ne (2, 7, 3). \tag{43} \]

It is necessary to exclude the possibility $(n, d, k) = (2, 7, 3)$ in the above result since
$P(2 \cdot 9 \cdot 16) = 3$. The proof depends on estimates from Prime Number theory. Further Shorey
and Tijdeman [99], [103] applied the theory of linear forms in logarithms for showing that

\[ P_d > C_{11} k \log\log \chi \quad \text{if } \chi > k(\log k)^{\varepsilon} \]

where $\varepsilon > 0$ and $C_{11} > 0$ is a number depending only on $\varepsilon$. It is clear that the assertion is
not valid if the assumption $\chi > k(\log k)^{\varepsilon}$ is replaced by $\chi > k(\log\log k)^{\delta}$ with $0 < \delta < 1$.
An analogue of (42) for $P_d$ with $d > 1$ is not known.
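
To make the role of the exceptional triple concrete, the following sketch computes greatest prime factors by trial division, checks Sylvester's theorem $P > k$ for $n > k$ on a small range, and confirms $P(2 \cdot 9 \cdot 16) = 3$. The ranges are arbitrary choices for illustration and prove nothing beyond the cases tested.

```python
def greatest_prime_factor(n):
    """Largest prime dividing n >= 2, by trial division."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest = p
            n //= p
        p += 1
    # Whatever is left above 1 is a prime larger than every factor removed.
    return n if n > 1 else largest


def P_block(n, k):
    """Greatest prime factor of n (n+1) ... (n+k-1), for n >= 2."""
    return max(greatest_prime_factor(n + i) for i in range(k))


# Sylvester (1892): P > k whenever n > k, tested on a small range only.
sylvester_ok = all(
    P_block(n, k) > k for k in range(1, 13) for n in range(k + 1, 200)
)

# The excluded case of (43): n = 2, d = 7, k = 3 gives 2.9.16 = 288 = 2^5 3^2.
exceptional = greatest_prime_factor(2 * 9 * 16)
print(sylvester_ok, exceptional)
```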


7 Cubes and Higher Powers in Products from a Block of Integers 


Let us recall the result of Erdős and Selfridge mentioned in the beginning of this paper. It
states that a product of two or more consecutive positive integers is never a cube or a higher
power. This is equivalent to saying that the equation

\[ n(n + 1) \cdots (n + k - 1) = y^{\ell} \quad \text{in integers } n > 0,\ y > 0,\ k \ge 2,\ \ell > 2 \tag{44} \]

has no solution. The proof of Erdős and Selfridge depends on an earlier method due
to Erdős [25] of 1955 that (44) does not hold if $k$ exceeds a sufficiently large absolute
constant. Another proof of this assertion was given by Erdős and Siegel (unpublished).
Now we explain the method of Erdős by giving a sketch of the proof that (44) does not hold
if $k$ is sufficiently large.
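
As a sanity check on (44), a brute-force search over a small box finds no product of $k \ge 2$ consecutive positive integers that is a perfect $\ell$-th power with $\ell \ge 3$. This is only an illustrative sketch; the search bounds and the helper names are arbitrary choices of ours.

```python
from math import prod


def integer_root(N, e):
    """Floor of the e-th root of N >= 1, by binary search on integers."""
    lo, hi = 1, 1 << (N.bit_length() // e + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** e <= N:
            lo = mid
        else:
            hi = mid - 1
    return lo


def is_perfect_power(N, min_exp):
    """True if N = r**e for some integers r >= 2 and e >= min_exp."""
    e = min_exp
    while 2 ** e <= N:
        r = integer_root(N, e)
        if r >= 2 and r ** e == N:
            return True
        e += 1
    return False


# Search n(n+1)...(n+k-1) for cubes and higher powers in a small box.
hits = [
    (n, k)
    for k in range(2, 9)
    for n in range(1, 301)
    if is_perfect_power(prod(range(n, n + k)), 3)
]
print(hits)
```

The empty list agrees with the theorem of Erdős and Selfridge on the range searched.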

Suppose that (44) is satisfied and $k$ exceeds a sufficiently large absolute constant. As in
Section 1, we apply Bertrand's postulate and Sylvester's theorem to conclude that $n > k^{\ell}$.
Further we write

\[ n + i = a_i x_i^{\ell} \quad \text{for } 0 \le i < k \]

where $a_i$ is $\ell$-free and $P(a_i) < k$. We see from $n > k^{\ell}$ that the elements of
$S_5 = \{a_0, a_1, \ldots, a_{k-1}\}$ are distinct. Then we apply a fundamental argument of Erdős for finding
a subset $S_6$ of $S_5$ containing at least $k/2$ elements and $a_i < 6k$ for $a_i \in S_6$. Further we
see from $n > k^{\ell}$ and $\ell \ge 3$ that the products $a_i a_j$ with $a_i \in S_6$, $a_j \in S_6$ are distinct. This property, as
mentioned in Section 1, implies that $|S_6| < 2k/\log k$. This is a contradiction.
As in Section 1, we consider a more general equation analogous to (6) for cubes and
higher powers, namely,
\[ (n + d_1) \cdots (n + d_t) = b y^{\ell} \tag{45} \]
in integers $n > 0$, $y > 0$, $\ell > 2$, $k \ge 2$, $d_1, \ldots, d_t$ and $b$. Equation (45) with $t = k$
and $b = 1$ is (44). We recall that $d_1, \ldots, d_t$ with $t \ge 2$ are distinct integers in $[0, k)$
and $P(b) \le k$. We always suppose that the left hand side of (45) is divisible by a prime
exceeding $k$. Then Saradha [67] proved that (45) with $t = k$ and $k \ge 4$ has no solution.
This extends a result of Erdős [23] on


\[ \binom{n + k - 1}{k} = y^{\ell}. \]


Further Győry [32] proved that (45) with $t = k$ and $k = 2, 3$ has no solution. A theorem
of Wiles [110] on the most famous exponential diophantine equation, namely Fermat's
equation, states that

\[ x^n + y^n = z^n \quad \text{in integers } n > 2,\ x > 0,\ y > 0,\ z > 0 \]
has no solution. This led to a striking theorem on the equation
\[ x^n + y^n = 2^{\alpha} z^n. \tag{46} \]

Győry derived his result from the theorems of Ribet [62] and Darmon and Merel [19] that
if $x > 0$, $y > 0$, $z > 0$, $n > 2$, $\alpha \ge 1$ are integers satisfying $x$, $y$, $z$ relatively prime, $n$
prime, $\alpha < n$ and (46), then $x = y = z = 1$.

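Erdős [25] observed that his elementary method sketched above yields that for $\varepsilon > 0$,
equation (45) with $t \ge k - (1 - \varepsilon) k \frac{\log\log k}{\log k}$ implies that $k$ is bounded by a number depending
only on $\varepsilon$. We put

\[ \nu_{\ell} = \frac{1}{2} \left( 1 + \frac{4\ell^2 - 8\ell + 7}{2(\ell - 1)(2\ell^2 - 5\ell + 4)} \right). \]

We observe that

\[ \nu_3 = \frac{47}{56}, \quad \nu_4 = \frac{45}{64} \quad \text{and} \quad \nu_{\ell} < 2/3 \ \text{for } \ell \ge 5. \]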
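
The quoted values of $\nu_3$ and $\nu_4$ pin down the formula for $\nu_{\ell}$. The sketch below assumes the formula reads $\nu_{\ell} = \frac{1}{2}\bigl(1 + \frac{4\ell^2 - 8\ell + 7}{2(\ell - 1)(2\ell^2 - 5\ell + 4)}\bigr)$ and evaluates it in exact rational arithmetic; the function name `nu` is ours.

```python
from fractions import Fraction


def nu(l):
    """nu_l = (1/2) * (1 + (4 l^2 - 8 l + 7) / (2 (l - 1) (2 l^2 - 5 l + 4)))."""
    numerator = 4 * l * l - 8 * l + 7
    denominator = 2 * (l - 1) * (2 * l * l - 5 * l + 4)
    return Fraction(1, 2) * (1 + Fraction(numerator, denominator))


nu3, nu4 = nu(3), nu(4)
below_two_thirds = all(nu(l) < Fraction(2, 3) for l in range(5, 1000))
print(nu3, nu4, float(nu(5)), below_two_thirds)
```

Under this reading $\nu_3 = 47/56$ and $\nu_4 = 45/64$ exactly, and $\nu_{\ell}$ decreases towards $1/2$ as $\ell$ grows.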


Shorey [84], [87] improved considerably the result of Erdős by proving that (45) with

\[ t \ge \nu_{\ell} k \tag{47} \]
implies that $k$ is bounded by an absolute constant. Further Shorey [84] proved that (45) with
\[ t \ge k \ell^{-1/11} + \pi(k) + 2 \tag{48} \]

implies that $\min(k, \ell)$ is bounded by an absolute constant. In fact the exponent 1/11 in (48)
can be replaced by $(1/3) + \varepsilon$ for $\varepsilon > 0$, see [52]. The assumption (47) has been relaxed by
Nesterenko and Shorey. For stating their result, we introduce the following notation


112¢7—160€4+29 -¢ 7 
» | 280-760-429 if € = | (mod 2) 


vy, = 
t 2 
112¢2—160€4+17 ip p — 

Be iagecing Ut =O (mod 2). 




For £ > 7, we observe that v, > 3/¢, 


7(1 — case) if £2 = 1 (mod 2) 


es ; 
AG toe (i. TaD?) if £=0 (mod 2) 
and 
\[ \nu'_7 < .4832, \quad \nu'_8 < .4556, \quad \nu'_9 < .3878, \quad \nu'_{10} < .3664, \]
\[ \nu'_{11} < .3243, \quad \nu'_{12} < .3076, \quad \nu'_{13} < .2787, \quad \nu'_{14} < .2655. \]


Nesterenko and Shorey [52] proved that (45) with
\[ \ell \ge 7, \quad t \ge \nu'_{\ell} k \]


implies that $k$ is bounded by a number depending only on $\ell$. The proofs depend on the theory
of linear forms in logarithms, irrationality measures of Baker proved by Padé approximations
and the method of Roth and Halberstam on the difference between consecutive $\nu$-free integers.
Here sharp estimates for linear forms in logarithms with $\alpha_i$'s close to 1 are crucial and,
as remarked in Section 4, they combine well with the irrationality measures of Baker. We
suggest examining whether, for $\varepsilon > 0$, equation (45) with $t \ge \varepsilon k$ implies that $k$ is bounded by a
number depending only on $\varepsilon$. We have not been able to derive this even from the abc conjecture.

Let $\varepsilon > 0$ and let $k$ exceed a sufficiently large number depending only on $\varepsilon$. It has been
conjectured that there is a prime dividing $n(n + 1) \cdots (n + k - 1)$ to the first power. If
$n \le k^{3/2}$, the conjecture is confirmed by the well-known results on the difference between
consecutive primes. Thus we suppose that $n > k^{3/2}$. Let $\ell \ge 3$ and

\[ \nu''_{\ell} = \begin{cases} \nu_{\ell} & \text{if } \ell = 3, 4, 5, 6 \\ \nu'_{\ell} & \text{if } \ell \ge 7. \end{cases} \]

We derive from the above results that there exists a prime $p$ satisfying
\[ p > (1 - \nu''_{\ell} - \varepsilon) k \log k \]

and
\[ \operatorname{ord}_p(n(n + 1) \cdots (n + k - 1)) \not\equiv 0 \pmod{\ell} \]

whenever $k$ exceeds a number depending only on $\varepsilon$ and $\ell$. For a prime $p$ between $k$ and
$(1 - \nu''_{\ell} - \varepsilon) k \log k$ dividing $n(n + 1) \cdots (n + k - 1)$, we omit the unique $n + i$ with $0 \le i < k$
such that $p$ divides $n + i$ and we denote the remaining ones from $n, n + 1, \ldots, n + k - 1$
by $n + d_1, \ldots, n + d_t$. We observe from Prime Number theory that $t \ge \nu''_{\ell} k$ and we may
suppose that equation (45) is satisfied. If $n \le k^{\ell}$, the assertion follows from (42). Thus we
may suppose that $n > k^{\ell}$ and we see from a fundamental argument of Erdős that the number
of $i$ with $1 \le i \le t$ and $P(n + d_i) \le k$ is at most $k \ell^{-1} + \pi(k)$. Therefore the left hand side
of (45) is divisible by a prime exceeding $k$. Finally we apply the above results of Shorey and
Nesterenko to conclude the proof. A weaker version of the assertion is contained in Shorey
and Tijdeman [102]. By a similar reasoning, the above assertion is valid with $p > (1 - \varepsilon) k \log k$
whenever $\min(k, \ell)$ exceeds a number depending only on $\varepsilon$.
8 Perfect Powers in Products of Integers in Arithmetic Progression


Fermat (see [49, p. 21]) proved that there are no four squares in arithmetic progression. As
stated in the preceding section, Darmon and Merel [19] proved that

\[ x^n + y^n = 2z^n \]

has no solution in positive integers $n > 2$, $x$, $y$ and $z$ other than the ones given by $x = y = z$.
Thus for $\ell > 2$ there are no three $\ell$-th powers in arithmetic progression. For earlier results
in this direction, see [21], [96] and [100]. Now we consider a more general situation. Let
$b$, $d$ and $k > 2$ be positive integers such that $P(b) \le k$. We consider

\[ n(n + d) \cdots (n + (k - 1)d) = b y^{\ell} \quad \text{in integers } n > 0,\ y > 0,\ k > 2,\ \ell \ge 2 \text{ with } \gcd(n, d) = 1. \tag{49} \]


The first result on (49) is due to Euler (see [49, p. 21]) that a product of four positive integers
in arithmetic progression is never a square. A related question is whether a product of four
positive integers in arithmetic progression can be a product of two consecutive positive
integers. The answer to this question is negative. Here the underlying equation is (35) with
$m = 2$, $d_1 = 1$, $k = 2$ which, as pointed out in [77], has infinitely many solutions in positive
integers $x$, $y$ and $d_2$. Our survey of the case $d = 1$ of (49) is already complete in Section 7
and we assume that $d \ge 2$ in this section. Then we see from (43) that $P(y) \le k$ if and only
if $(n, d, k) = (2, 7, 3)$. Thus the left hand side of (49) is divisible by a prime exceeding $k$
whenever $(n, d, k) \ne (2, 7, 3)$. There is no loss of generality in assuming that $\ell$ is a prime
number in equation (49) and the subsequent equations (50), (51). Erdős conjectured that (49)
implies that $k$ is bounded by an absolute constant. We shall show at the end of this section
that the conjecture of Erdős with $\ell > 3$ is a consequence of the abc conjecture. Saradha [67]
proved that (49) with $d \le 6$ and $k \ge 3$ has no solution. If $\ell = 2$, Saradha [68] showed that
the assertion is valid even when $d \le 22$ unless $(n, d, k) = (2, 7, 3), (18, 7, 3), (64, 17, 3)$.
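
The three exceptional triples for $\ell = 2$ can be verified directly: in each case $n(n + d)(n + 2d) = b y^2$ with $P(b) \le 3$. The sketch below extracts the squarefree part $b$ by trial division; the helper name `squarefree_part` is an assumption of this illustration.

```python
from math import isqrt


def squarefree_part(N):
    """Return s squarefree with N = s * y^2, using trial division."""
    s, p = 1, 2
    while p * p <= N:
        exponent = 0
        while N % p == 0:
            N //= p
            exponent += 1
        if exponent % 2 == 1:
            s *= p
        p += 1
    return s * N if N > 1 else s


triples = [(2, 7, 3), (18, 7, 3), (64, 17, 3)]
decompositions = []
for n, d, k in triples:
    product = 1
    for i in range(k):
        product *= n + i * d
    b = squarefree_part(product)
    y = isqrt(product // b)
    assert b * y * y == product
    decompositions.append((b, y))
print(decompositions)
```

So $2 \cdot 9 \cdot 16 = 2 \cdot 12^2$, $18 \cdot 25 \cdot 32 = 120^2$ and $64 \cdot 81 \cdot 98 = 2 \cdot 504^2$, with $b \in \{1, 2\}$ in every case.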
Marszalek [43] confirmed the conjecture for fixed $d$. If $\ell > 3$, Shorey [88] proved that
(49) implies that $k$ is bounded by a number depending only on the greatest prime factor
of $d$. Further Shorey and Tijdeman [97] proved that (49) implies that $k$ is bounded by a
number depending only on $\ell$ and $\omega(d)$. It is an open problem to bound $k$ by a number
depending only on $\omega(d)$ in the preceding result. If $\ell > 7$, Shorey [92] showed that $k$ is
bounded by a number depending only on $n$ whenever (49) is satisfied. The proof of the
former result is elementary whereas the proof of the latter depends on combining sharp
estimates from linear forms in logarithms with $\alpha_i$'s close to one and irrationality measures
proved by the hypergeometric method.

Now we relax the assumption $\gcd(n, d) = 1$ in the above stated results on (49). For this,
we consider

\[ n(n + d) \cdots (n + (k - 1)d) = b y^{\ell} \quad \text{in integers } n > 0,\ y > 0,\ k > 2,\ \ell \ge 2 \tag{50} \]

where we do not subject the solutions to the restriction $\gcd(n, d) = 1$. We combine the result
of Saradha stated above with the results of Saradha and Győry of Section 7. We derive
that (50) with $d \in \{2, 3, 4, 6\}$ and $k \ge 3$ does not hold unless $k = 3$, $\ell = 2$ and $n = 48d$
if the left hand side of (50) is divisible by a prime exceeding $k$. Further the assertions of
Marszalek and Shorey are also valid for (50) such that its left hand side is divisible by a
prime exceeding $k$. Finally we turn to relaxing the assumption $\gcd(n, d) = 1$ in the result
of Shorey and Tijdeman. For this, we consider an extension of (45) and (49):


\[ (n + d_1 d) \cdots (n + d_t d) = b y^{\ell} \tag{51} \]
in integers $n > 0$, $y > 0$, $\ell \ge 2$, $k > 2$, $d_1, \ldots, d_t$ and $b$ such that $\gcd(n, d) = 1$ and
$P(b) \le k$. We recall that $d_1, \ldots, d_t$ with $t \ge 2$ are distinct non-negative integers less than
$k$. We suppose that the left hand side of (51) is divisible by a prime exceeding $k$. Let $\varepsilon > 0$
and

\[ h(k) = \begin{cases} \log\log k & \text{if } \ell \ge 5 \\ \log\log\log k & \text{if } \ell = 2, 3. \end{cases} \]

Then a consequence of a result of Shorey and Tijdeman [102] states that equation (51) with

\[ t \ge k - (1 - \varepsilon) k \frac{h(k)}{\log k} \]
implies that $k$ is bounded by a number depending only on $\ell$ and $\omega(d)$. This includes a result
of Shorey and Tijdeman stated above on (49). In view of Balasubramanian and Shorey [8,
Theorem 2], the assumption that the left hand side of (50) is divisible by a prime exceeding
$k$ is necessary.

We apply the preceding result of Shorey and Tijdeman to show that (50) with $d \nmid n$ implies
that $k$ is bounded by a number depending only on $\ell$ and $\omega(d)$. Suppose that (50) with $d \nmid n$
is satisfied and $k$ exceeds a sufficiently large number depending only on $\ell$ and $\omega(d)$. There
is no loss of generality in assuming that every prime divisor of $y$ is greater than $k$. We put

\[ G = \gcd(n, d), \qquad G = G_1 G_2 \]
where $P(G_1) \le k$ and every prime divisor of $G_2$ exceeds $k$. Further we set
\[ n' = nG^{-1}, \qquad d' = dG^{-1}, \qquad b' = bG_1^{-k}. \]

We observe that $\gcd(n', d') = 1$ and $d' \ge 2$ since $d \nmid n$. By dividing both the sides of (50)
by $G^k$, we see that
\[ n'(n' + d') \cdots (n' + (k - 1)d') = b' y^{\ell} G_2^{-k} \]

where $b'$ and $y^{\ell} G_2^{-k}$ are positive integers such that $P(b') \le k$. For a prime $p$ dividing $G_2$
and $y^{\ell} G_2^{-k}$, we observe that $p > k$ and there is precisely one integer $F(p)$ in $[0, k)$ such
that $p$ divides $n' + F(p)d'$. We write $\{d_1, \ldots, d_t\}$ for the set obtained by deleting all $F(p)$
with $p$ dividing $\gcd(G_2, y^{\ell} G_2^{-k})$. We observe that

\[ t \ge k - \min(\omega'(n), \omega'(d)) \ge k - \omega(d) \tag{52} \]
where $\omega'(\nu)$ denotes the number of prime divisors of $\nu$ exceeding $k$ and

\[ n'(n' + d_1 d') \cdots (n' + d_t d') = b_1 y_1^{\ell} \tag{53} \]
where $b_1$ and $y_1$ are positive integers such that $P(b_1) \le k$ and all the prime divisors of $y_1$
are greater than $k$. We show that the left hand side of (53) is divisible by a prime exceeding
k. We prove by contradiction. Let « = i and suppose that the greatest prime factor of the 
left hand side of (53) does not exceed $k$. Then we apply a fundamental argument of Erdős
for deriving that
t—m(k)-1 
[| @'+id) <k# 
j=0 


which implies that $n' < 4k$ and $d' < 8$. Now we apply the Prime Number Theorem for
arithmetic progressions and (52) to derive that there are at least $(1 - \varepsilon) d' k / (\varphi(d') \log k)$
prime numbers among $n' + d_1 d', \ldots, n' + d_t d'$. Therefore there are at least $(1 - \varepsilon) d' k / (\varphi(d') \log k)$
prime numbers $p$ satisfying $p \equiv n' \pmod{d'}$ and $p \le k$. This, since $d' \ge 2$, is not
possible by the Prime Number Theorem for arithmetic progressions. Now we apply the result
of Shorey and Tijdeman [102] stated above to (53) for concluding the desired assertion.

For a result analogous to the last assertion of Section 7, see Shorey and Tijdeman [102].
Several estimates on equation (49) have been proved. For example, it is proved in [97], [101]
and [92] that (49) implies that

\[ d > k^{C_{12} \log\log k}, \qquad P(d) > C_{12}\, \ell \log k \log\log k \]

and
\[ n > k^{C_{12} \log\log k} \quad \text{if } \ell > 7 \]

where $C_{12} > 0$ is an absolute constant. Finally we apply the above estimate for $d$ to
conclude that the abc conjecture implies the conjecture of Erdős on (49) with $\ell > 3$. We
suppose (49) with $\ell > 3$ and $k$ exceeding a sufficiently large absolute constant. Then
we write

\[ n + id = a_i x_i^{\ell} \quad \text{for } 0 \le i < k \]


where P(a;) < k and every prime divisor of x; exceeds k. By a fundamental argument of 
Erdés, we find positive integers f < g < hsuchthatay, ag, a, donotexceed k*. We have 


\[ (g - f)(n + hd) + (h - g)(n + fd) = (h - f)(n + gd), \]

i.e.

\[ (g - f) a_h x_h^{\ell} + (h - g) a_f x_f^{\ell} = (h - f) a_g x_g^{\ell}. \]


We observe that X < kx, which implies that x ¢ < kx since! > 3. Similarly x, < kXxg. 
The greatest common divisor v of the summands on the left hand side in the above relation 
do not exceed k? and we divide both the sides of the above relation by v to apply a bc 


conjecture. We conclude from a b c conjecture with e« = 1/6 that 
n+ed= AgX, < aa aa 


Therefore n + gd < k® since! > 5. Finally we conclude fromn+ gd > d > k©12 108 logk 
that k is bounded by an absolute constant. 
Algebraic Independence of Transcendental
Numbers: A Survey


Michel Waldschmidt


A survey on algebraic independence of transcendental numbers by Gel'fond's method was published by the
author in 1984 [84 W1]. Here we cover the recent period 1984/1997.


Introduction 


Important progress concerning algebraic independence of transcendental numbers has been 
achieved recently. Our goal is to introduce some of the most important ones. So many new 
results have been produced that we need to make a choice: here we wish to concentrate 
our survey on the description of new results which are related with Gel’fond’s method. 
However we shall also include other topics (like Nesterenko’s results on modular functions). 
Moreover our list of references will cover a broader scope. 

Here is a short list of some of the many topics which are not covered here. 


• Works involving the method of Siegel and Shidlovskii, including values of hypergeometric
functions. One main reference on this topic is Shidlovskii's book [87 Shi2] and
[89 Shi2]. Among many contributions to this subject, we mention those of Shidlovskii
himself [87 Shi1], [89 Shi1], [89 Shi3], [91 Shi1], [91 Shi2], as well as his work with
Yu.V. Nesterenko [96 NS]. Several papers have been produced by other members of
Shidlovskii's school, including Yu.V. Nesterenko [88 Ne], V. Kh. Salikhov [84 Sal],
[89 Sal], [90 Sal1], [90 Sal2], V.A. Kulagin [87 K], [91 K], [92 K], [96 K], N.I. Lossov
[89 Lo], M.A. Cherepnev [90 Chr], [91 Chr], [94 Chr], [95 Chr], A.I. Zhukov [91 Zh]
and V.A. Gorelov [92 Go]. Further references are [84 BV], [85 Br], [86 Br], [88 Beu]
and [88 BeBrH]. Most of these articles are devoted to the study of algebraic independence
of the values of solutions of differential equations, applying Siegel's or
Shidlovskii's main theorems.

• The method of Bézivin-Robba [89 BR], [90 BBR], which produces a new proof of the
Lindemann-Weierstraß Theorem. We refer to Y. André's papers [96 An] and [98 An2],
which explain connections with E-functions as well as with G-functions.

• Mahler's method. An excellent reference on this subject is Kumiko Nishioka's Lecture
Notes [96 Ni1] where many further references are provided. See also [86 Ni1],
[86 Ni2], [86 Ni3], [87 Ni], [88 Bec], [89 BN], [89 Ni], [90 Ni], [90 NN], [90 Wal],
[91 Am], [91 Bec], [91 Ni], [92 Bec], [94 Ni], [94 Tö1], [95 Tö1], [95 Tö2], [96 Ni2],
[96 Ni3], [96 Ta], [97 Ni] and [97 To].
• Liouville's argument yields not only transcendental numbers, but also algebraically
independent numbers. This way of constructing algebraically independent numbers
is related with results of transcendence or algebraic independence on values of gap
series. Many authors have studied this question (see a survey in [90 W1]), including
Zhu Yao Chen [84 Z], [85 Z2], [85 Z3], [87 Z], [88 Z], [89 WZ], [91 Z1], [91 Z2], as
well as W.W. Adams [85 Ad], M. Amou [85 Am], [91 Am], P. Bundschuh [88 Bu],
[90 Bu], R. Müller [93 Mü] and T. Töpfer [94 Tö2].

• Transcendence and algebraic independence in finite characteristic. The recent survey
by W.D. Brownawell [98 Br] gives a good description of this rich theory, including the
important achievements by Jing Yu [92 Y], [97 Y] as well as the works of A. Thiery
[92 Th], W.D. Brownawell [93 Br], [94 BBT], [96 Br], R. Tubbs [96 Tu] and L. Denis
[93 Den1], [93 Den2], [95 Den1], [95 Den2], [96 Den], [97 Den]. We also quote the
work [93 Mü] by R. Müller which involves non archimedean valued fields.

Also related to this topic is the paper [88 AMP] by J-P. Allouche, M. Mendès-France
and A.J. van der Poorten, dealing with the algebraic independence over $F(X)$ of
powers $f^{\alpha_1}, \ldots, f^{\alpha_s}$, where $f$ is a formal power series, with coefficients in a finite
field $F$, which is algebraic over $F(X)$.


• Metric theory and classifications of transcendental numbers.
• Lower bounds for linear forms in logarithms of algebraic numbers.


In this survey we first state a few conjectures (§1). Next we describe the methods of 
transcendence and algebraic independence (§2); some proofs of algebraic independence 
results involve a transcendence criterion, other ones rest on an effective version of Hilbert’s 
Nullstellensatz; a more recent approach is via simultaneous Diophantine approximation. 
Finally we state some of the most important recent progress on this topic, starting with
Gel'fond's method (§3), considering results of algebraic independence related to the modular
functions (§4) and finishing with Philippon's Diophantine rings (§5).

We would like to point out two important directions related with Gel’fond’s method of 
algebraic independence: 


• Commutative algebraic groups: the currently available tools and methods are well
adapted to this geometrical setting, and a rather strong set of general results is available
(even if we are far from being able to establish the main conjectures). A survey
of this topic is [96 Ca].

• Diophantine approximation and algebraic independence. The connection between
these two topics opens the way to a promising field of research. We discuss some
of these new advances below (§2.4); a good starting point to get acquainted with this
subject is [97 RW2].


1 Conjectures 


In this section we collect various conjectures of algebraic independence. We start with 
Schanuel’s one (see [84 W1] for references): 
Conjecture 1.1 (Schanuel). Let $x_1,\dots,x_n$ be Q-linearly independent complex numbers.
Then the transcendence degree over Q of the field

$$\mathbb{Q}(x_1,\dots,x_n,e^{x_1},\dots,e^{x_n})$$

is at least n.
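
To illustrate the strength of Conjecture 1.1, here is a standard consequence, sketched under the sole assumption that the conjecture holds for n = 2 (the choice of $x_1$, $x_2$ is ours, not part of the text above):

```latex
% Sketch: Schanuel's Conjecture with n = 2 would imply that e and \pi
% are algebraically independent.
\begin{align*}
&x_1 = 1,\quad x_2 = i\pi
  \quad\text{(linearly independent over $\mathbb{Q}$, since $i\pi \notin \mathbb{Q}$)},\\
&\mathbb{Q}(x_1,x_2,e^{x_1},e^{x_2})
  = \mathbb{Q}(1,\ i\pi,\ e,\ -1) = \mathbb{Q}(i\pi,\ e),\\
&\operatorname{trdeg}_{\mathbb{Q}}\mathbb{Q}(i\pi,\ e) \ge 2
  \;\Longrightarrow\; \text{$e$ and $i\pi$ are algebraically independent}.
\end{align*}
% Since i is algebraic, e and \pi would then be algebraically independent as well.
```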


The following problem of A.O. Gel’fond and Th. Schneider (again, see [84 W1]) is a 
special case of Conjecture 1.1: 


Conjecture 1.2 (Gel’fond-Schneider). Let $\alpha$ be a non zero complex algebraic number,
$\log\alpha$ a non zero logarithm of $\alpha$ and $\beta$ an algebraic number of degree $d \ge 2$. For $z \in \mathbb{C}$,
define $\alpha^z := \exp(z\log\alpha)$. Then the $d-1$ numbers

$$\alpha^{\beta},\ \alpha^{\beta^2},\ \dots,\ \alpha^{\beta^{d-1}}$$

are algebraically independent.

It is equivalent to say that for algebraic numbers $\beta_1,\dots,\beta_m$ which are pairwise distinct
modulo Z¹, the numbers

$$\alpha^{\beta_1},\ \dots,\ \alpha^{\beta_m}$$

are linearly independent over Q (or over $\overline{\mathbb{Q}}$, as you wish). Partial results are known
on the question of algebraic independence, but almost nothing is known on the question of
linear independence!
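
The reduction of Conjecture 1.2 to Conjecture 1.1 can be sketched as follows; this is the usual argument, written out here under the notation of Conjecture 1.2 and not quoted from [84 W1]:

```latex
% Sketch: Conjecture 1.1 applied with n = d implies Conjecture 1.2.
\begin{align*}
&x_j = \beta^{\,j-1}\log\alpha \quad (1 \le j \le d):
  \ \text{$\mathbb{Q}$-linearly independent, since $1,\beta,\dots,\beta^{d-1}$ are
  and $\log\alpha \ne 0$},\\
&K = \mathbb{Q}\bigl(\log\alpha,\ \beta\log\alpha,\ \dots,\ \beta^{d-1}\log\alpha,\
      \alpha,\ \alpha^{\beta},\ \dots,\ \alpha^{\beta^{d-1}}\bigr),
  \qquad \operatorname{trdeg}_{\mathbb{Q}} K \ge d,\\
&K \ \text{is algebraic over}\
  \mathbb{Q}\bigl(\log\alpha,\ \alpha^{\beta},\ \dots,\ \alpha^{\beta^{d-1}}\bigr)
  \quad\text{(because $\alpha$ and $\beta$ are algebraic)}.
\end{align*}
% Hence the d numbers \log\alpha, \alpha^{\beta}, ..., \alpha^{\beta^{d-1}} would be
% algebraically independent; in particular so would the d - 1 powers of \alpha.
```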

In 1949, A.O. Gel’fond solved Conjecture 1.2 in the case d = 3. More generally, he
proved that for $d \ge 3$, two at least of the d − 1 numbers in Conjecture 1.2 are algebraically
independent. Such a result, which shows that some field has transcendence degree $\ge 2$ over
Q, is a statement of “small transcendence degree”. Showing that the transcendence degree
tends to infinity with d (as achieved first by G.V. Chudnovsky) is proving a result of “large
transcendence degree”. Working with the exponential function, A.O. Gel’fond introduced
two sets of Q-linearly independent numbers, $(x_1,\dots,x_d)$ and $(y_1,\dots,y_\ell)$, and proved
results of small transcendence degree involving the $d\ell$ values $e^{x_i y_j}$ $(1 \le i \le d,\ 1 \le j \le \ell)$.

Available methods of algebraic independence on such values of the exponential function
which yield “large transcendence degrees” all require a so-called “technical hypothesis”,
namely a measure of linear independence for the tuple $(x_1,\dots,x_d)$ as well as for the tuple
$(y_1,\dots,y_\ell)$:

Definition: We shall say that a set $\{x_1,\dots,x_n\}$ of Q-linearly independent complex
numbers satisfies a measure of linear independence if, for any $\varepsilon > 0$, there exists a positive
number $H_0$ such that, for any $H \ge H_0$ and n-tuple $(h_1,\dots,h_n)$ of rational integers
satisfying $0 < \max\{|h_1|,\dots,|h_n|\} \le H$, the inequality

$$|h_1x_1 + \cdots + h_nx_n| \ge \exp\{-H^{\varepsilon}\}$$

holds.
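
A simple example may help to calibrate this definition. The following sketch (our own illustration, with the pair $\{1,\sqrt2\}$ chosen for convenience) shows that the pair satisfies a measure of linear independence, by a Liouville type argument:

```latex
% Sketch: \{1, \sqrt{2}\} satisfies a measure of linear independence.
% Here (h_1, h_2) is a non zero pair of rational integers with max(|h_1|, |h_2|) <= H.
\begin{align*}
&|h_1 + h_2\sqrt2|\cdot|h_1 - h_2\sqrt2| = |h_1^2 - 2h_2^2| \ge 1
  \quad\text{(a non zero rational integer, since $\sqrt2 \notin \mathbb{Q}$)},\\
&|h_1 - h_2\sqrt2| \le (1+\sqrt2)\,H,\\
&|h_1 + h_2\sqrt2| \ \ge\ \frac{1}{(1+\sqrt2)\,H}
  \ \ge\ \exp\{-H^{\varepsilon}\}
  \quad\text{as soon as } \log\bigl((1+\sqrt2)H\bigr) \le H^{\varepsilon}.
\end{align*}
% The last condition holds for all H >= H_0(\varepsilon), which is what the definition asks.
```

The same computation shows that the requirement $\exp\{-H^{\varepsilon}\}$ is much weaker than a polynomial lower bound in H.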


¹ This means $\beta_i - \beta_j \notin \mathbb{Z}$ for $i \ne j$.
As a matter of fact, in the original papers, the actual results which we are going to discuss
sometimes involve weaker assumptions (see [98 FN] for a more careful discussion). But 
we shall use this definition for convenience and simplicity. 

Schanuel’s Conjecture suggests that such a technical hypothesis should not be necessary,
and it is a challenge to remove it (see [92 Del], [93 Del], [96 Del1] and [96 Del2]).
Unfortunately, so far, only results of small transcendence degree are free of such an assumption.

By the way, we take this opportunity to point out that the generalization of Schanuel’s
Conjecture which is suggested on p. 566 of [84 W1] is too optimistic: it does not hold without
a technical hypothesis, as shown by the following result of A. Bijlsma²:


Lemma 1.3 (Bijlsma). For any fixed natural number k, there exist irrational numbers
$a, b \in (0,1)$ such that for infinitely many triples $(\alpha,\beta,\gamma)$ of rational numbers

$$\max\{|a-\alpha|,\ |b-\beta|,\ |a^{b}-\gamma|\} < \exp(-(\log H)^{k}),$$

where H denotes the maximum of the heights of $\alpha$, $\beta$ and $\gamma$.


Recall that the usual height of a polynomial P with rational (and more generally complex) 
coefficients is the maximum of the absolute values of its coefficients, and the (usual) height 
of an algebraic number is the usual height of its minimal polynomial over Z. 

In place of the generalization of Schanuel’s Conjecture suggested in [84 W1] section IV.2,
we now propose the following statement, whose assumptions include a technical hypothesis:


Conjecture 1.4 Let $x_1,\dots,x_n$ be Q-linearly independent complex numbers which satisfy
a measure of linear independence. Let d be a positive integer. Then there exists a positive
number $C = C(x_1,\dots,x_n,d)$ with the following property: for any integer $H \ge 2$ and any
(n + 1)-tuple $P_1,\dots,P_{n+1}$ of polynomials in $\mathbb{Z}[X_1,\dots,X_n,Y_1,\dots,Y_n]$ with degrees $\le d$
and usual heights $\le H$, which generate an ideal of $\mathbb{Q}[X_1,\dots,X_n,Y_1,\dots,Y_n]$ of rank³
n + 1, we have

$$\sum_{j=1}^{n+1}\bigl|P_j(x_1,\dots,x_n,e^{x_1},\dots,e^{x_n})\bigr| \ge H^{-C}.$$


One of the most important special cases of Schanuel’s Conjecture deals with the set

$$\mathcal{L} = \{\ell \in \mathbb{C}\ ;\ e^{\ell} \in \overline{\mathbb{Q}}^{\times}\}$$

of logarithms of algebraic numbers:


Conjecture 1.5 (Conjecture on algebraic independence of logarithms of algebraic
numbers) If $\ell_1,\dots,\ell_n$ are Q-linearly independent elements of $\mathcal{L}$, then they are algebraically
independent.


² Bijlsma, Alex: On the simultaneous approximation of $a$, $b$ and $a^{b}$. Compositio Math. 35 (1977), no. 1,
99-111.

³ The rank of a prime ideal $\mathfrak{P} \subset \mathbb{Q}[T_1,\dots,T_m]$ is the largest integer $r \ge 0$ such that there exists an increasing
chain of prime ideals $(0) = \mathfrak{P}_0 \subset \mathfrak{P}_1 \subset \cdots \subset \mathfrak{P}_r = \mathfrak{P}$.
The rank of an ideal $J \subset \mathbb{Q}[T_1,\dots,T_m]$ is the minimum rank of a prime ideal containing J.
A formal analog of Conjecture 1.5, which answers a question of Y. Hellegouarch, has
been established by D.L. McQuillan [85 Mc]:

Let k be a field of zero characteristic, $P_1, P_2, \dots$ non constant polynomials in $k[x]$ which
are pairwise relatively prime and satisfy $P_i(0) = 1$. Define, for $i \ge 1$,

$$\operatorname{Log} P_i = -\sum_{n=1}^{\infty}\frac{1}{n}\bigl(1 - P_i(x)\bigr)^{n} \in k[[x]].$$

Then $\operatorname{Log} P_1, \operatorname{Log} P_2, \dots$ are algebraically independent over $k(x)$.


Let us come back to the classical case. From Conjecture 1.4 one deduces the following
quantitative version of Conjecture 1.5: Let $n \ge 2$ and $d \ge 1$ be positive integers and
$a_1,\dots,a_n$ be multiplicatively independent positive integers, i.e. positive elements in Z
such that the only relation $a_1^{b_1}\cdots a_n^{b_n} = 1$ with $(b_1,\dots,b_n) \in \mathbb{Z}^n$ arises from the trivial
case $b_1 = \cdots = b_n = 0$. Then there exists a constant C > 0, depending on n, $a_1,\dots,a_n$
and also on d, such that, for any integer $H \ge 2$ and any non zero polynomial P in
$\mathbb{Z}[X_1,\dots,X_n]$ of degree $\le d$ and height $\le H$,

$$|P(\log a_1,\dots,\log a_n)| > H^{-C}.$$


Only the case of a homogeneous linear form $P = b_1X_1 + \cdots + b_nX_n$ is known (Theorem
of Baker-Fel’dman).

The work of D. Roy [92 R] yields a new approach to Conjecture 1.5, involving matrices
whose coefficients are linear combinations, with algebraic coefficients, of elements of $\mathcal{L}$.
For such a matrix M, say of size $d \times \ell$, he defines the structural rank $r_{\mathrm{str}}(M)$ of M as the
rank⁴ of the matrix $\widetilde{M}$ obtained from M by replacing with indeterminates a basis of the
vector space over $\overline{\mathbb{Q}}$ spanned by the coefficients of M: if

$$M = M_0 + M_1\ell_1 + \cdots + M_k\ell_k,$$

where $1, \ell_1,\dots,\ell_k$ are $\overline{\mathbb{Q}}$-linearly independent, $\ell_j \in \mathcal{L}$ $(1 \le j \le k)$ and $M_j \in \mathrm{Mat}_{d\times\ell}(\overline{\mathbb{Q}})$
$(0 \le j \le k)$, then

$$\widetilde{M} = M_0 + M_1X_1 + \cdots + M_kX_k \in \mathrm{Mat}_{d\times\ell}\bigl(\overline{\mathbb{Q}}(X_1,\dots,X_k)\bigr).$$

It is neither surprising nor difficult to check that if Conjecture 1.5 holds, then the rank of M
is equal to $r_{\mathrm{str}}(M)$. What is remarkable is the converse: if the rank of M is always equal to
$r_{\mathrm{str}}(M)$, then Conjecture 1.5 is true.
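
Two small cases make the notion of structural rank concrete. The matrices below are our own illustration (with $\ell_1 = \log 2$, $\ell_2 = \log 3$, and $\widetilde{M}$ denoting the matrix obtained by replacing $\ell_1,\ell_2$ with indeterminates $X_1,X_2$); the linear independence of $1, \log 2, \log 3$ over the algebraic numbers is taken from Baker's theorem:

```latex
% Sketch: structural rank of two 2 x 2 matrices with entries in the
% Q-span of \ell_1 = \log 2 and \ell_2 = \log 3.
\begin{align*}
M &= \begin{pmatrix} \log 2 & \log 3\\ \log 3 & \log 2 \end{pmatrix},
&\widetilde{M} &= \begin{pmatrix} X_1 & X_2\\ X_2 & X_1 \end{pmatrix},
&\det\widetilde{M} &= X_1^2 - X_2^2 \ne 0,
& r_{\mathrm{str}}(M) &= 2,\\
M' &= \begin{pmatrix} \log 2 & \log 3\\ \log 4 & \log 9 \end{pmatrix},
&\widetilde{M'} &= \begin{pmatrix} X_1 & X_2\\ 2X_1 & 2X_2 \end{pmatrix},
&\det\widetilde{M'} &= 0,
& r_{\mathrm{str}}(M') &= 1.
\end{align*}
% In both cases the actual rank agrees with the structural rank:
% \det M = (\log 2)^2 - (\log 3)^2 \ne 0, while the rows of M' are proportional.
```

In such small examples the equality of the two ranks can be checked by hand; the content of Roy's equivalence is that the equality for all matrices is as strong as Conjecture 1.5.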

In [95 R], D. Roy studies the set of points, whose coordinates are logarithms of algebraic 
numbers, which belong to a given algebraic variety. This work may be viewed as a first step 
towards the conjecture on algebraic independence of logarithms, since D. Roy remarks in 
[95 R] that Conjecture 1.5 is also equivalent to the following statement: For each positive
integer n and each algebraic subvariety V of $\mathbb{C}^n$ defined over Q, the set $V \cap \mathcal{L}^n$ is contained
in the union of all vector subspaces of $\mathbb{C}^n$ defined over Q and contained in V.


⁴ The rank of a matrix is the rank over the field generated by its entries.
Further related conjectures of algebraic independence dealing with commutative
algebraic groups are discussed in [96 W]. 

Conjectural extensions of the Lindemann-Weierstraß Theorem to algebraic groups are
considered in [87 P]. 

A conjectural description of the ideal of relations of algebraic dependence between 
periods of algebraic varieties which are defined over the field of algebraic numbers has 
been proposed by A. Grothendieck in 1966 and made more precise by S. Lang the same 
year. A joint generalization of Grothendieck’s and Schanuel’s Conjectures is suggested 
by Y. André [98 An1], who also proposes a transcendence conjecture on uniformization
of Shimura’s varieties, and discusses a non abelian variant of Grothendieck’s Conjecture 
(see [97 Pr]), due to C. Simpson [90 Si], concerning the Riemann-Hilbert correspondence. 
Moreover in [98 An1] one can find comments on links between these conjectures and sev- 
eral topics, especially the Conjectures of Rohrlich and Lang on the algebraic dependence 
relations between values of the Gamma function, the results of Nesterenko on the modular 
functions (see §4 below), and the results of P. Cohen, H. Shiga and J. Wolfart, who extend
to higher dimension Schneider’s Theorem on the modular invariant j.

Notice that the conjectures in [86 W] are much less ambitious (and therefore should 
be easier to reach) than the general ones of Y. André in [98 An1]: they are more closely
connected with currently available methods. 

A conjecture in the “folklore” is that the zeroes of the Riemann zeta function (say their 
imaginary parts, assuming it > 0) are algebraically independent. As suggested by J-P. Serre, 
one might be tempted to consider also 


• The eigenvalues of the hyperbolic Laplacian in the upper half plane
modulo $SL_2(\mathbb{Z})$ (i.e. to study the algebraic independence of the zeroes of the Selberg zeta
function).

• The eigenvalues of the Hecke operators acting on the corresponding eigenfunctions
(Maass forms).


2 Methods 


2.1 Transcendence and Algebraic Independence 


Given functions $f_1,\dots,f_d$ and points $u_1,\dots,u_\ell$, a central problem is to give, under suitable
assumptions, a lower bound for the transcendence degree of the field generated over Q by
the $d\ell$ numbers $f_i(u_j)$ $(1 \le i \le d,\ 1 \le j \le \ell)$. Here is an example with d = 2, $\ell = 1$
and $f_1(z) = z$: consider a given function f and an algebraic number $\alpha$ such that $f(\alpha)$ is
also algebraic; one wishes to deduce that $\alpha$ belongs to a restricted set. If $f(z) = e^{z}$, the
restricted set is just {0}; if $f(z) = e^{i\pi z}$, the restricted set is Q.

In general, one deals with a finite set of numbers $\{\theta_1,\dots,\theta_n\}$. In the last example one
would take n = 2, $\theta_1 = \alpha$, $\theta_2 = f(\alpha)$.

The basic methods of transcendence usually involve the construction of some number $\gamma$
in the field $K = \mathbb{Q}(\theta_1,\dots,\theta_n)$. This number $\gamma$ typically is either the value of an auxiliary
function, or the determinant of an interpolation matrix.

To start with, assume that all numbers $\theta_1,\dots,\theta_n$ are algebraic. Therefore the number $\gamma$
also is algebraic. Three main steps in the proof are:
• Zero estimate⁵: This number $\gamma$ does not vanish.
• Arithmetic estimate: Lower bound for $|\gamma|$.
• Analytic estimate: Upper bound for $|\gamma|$.


We are interested here in the lower bound, which is a Liouville type argument: the size 
inequality, or the product formula, always rest on the fact that a non zero rational integer 
has absolute value at least 1. 
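
The simplest instance of this Liouville type argument is the classical inequality for rational approximations to an algebraic number. The following sketch is the textbook derivation, written under the stated assumptions (the symbols $\alpha$, $P$, $M$ are introduced only for this illustration):

```latex
% Sketch of the arithmetic lower bound in its simplest form.
% Assumptions: \alpha is a real algebraic number of degree d >= 2 with minimal
% polynomial P in Z[X]; p/q is a rational number with q >= 1 and |\alpha - p/q| <= 1.
\begin{align*}
&q^{d}P(p/q) \in \mathbb{Z}\setminus\{0\}
  \quad\text{($P$ is irreducible of degree $\ge 2$, hence has no rational root)},\\
&|P(p/q)| \ge q^{-d},\\
&|P(p/q)| = |P(p/q) - P(\alpha)| \le M\,|\alpha - p/q|,
  \qquad M := \max_{|x-\alpha|\le 1}|P'(x)|,\\
&|\alpha - p/q| \ \ge\ \frac{1}{M\,q^{d}}.
\end{align*}
% The first line is the arithmetic step (a non zero integer has absolute value >= 1);
% the third line is the analytic step (mean value theorem).
```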

Such a method yields the transcendence of values of f at points which do not belong to
the restricted set. In general, it enables one to prove that some fields $\mathbb{Q}(\theta_1,\dots,\theta_n)$ have
transcendence degree $\ge 1$ over Q. Variants of this argument yield measures of simultaneous
approximation by algebraic numbers: a quantitative refinement of the transcendence result
is a lower bound for $\max_{1\le i\le n}|\theta_i - \gamma_i|$, when $\gamma_1,\dots,\gamma_n$ are algebraic numbers (such a lower
bound depends explicitly on the degrees and heights of the algebraic numbers $\gamma_1,\dots,\gamma_n$).

Here is an example of a measure of simultaneous approximation by algebraic numbers 
(see [97 RW2]). We define the absolute logarithmic height of an algebraic number $\gamma$ by

$$h(\gamma) = \frac{1}{d}\log a_0 + \frac{1}{d}\sum_{i=1}^{d}\log\max\{1, |\gamma_i|\}$$

when the minimal polynomial $a_0X^{d} + \cdots + a_d$ of $\gamma$ over Z (with $a_0 > 0$) decomposes over
C as

$$a_0\prod_{i=1}^{d}(X - \gamma_i).$$

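
Two quick computations show how this height behaves; they use the standard formula $h(\gamma) = \frac1d\bigl(\log a_0 + \sum_{i}\log\max\{1,|\gamma_i|\}\bigr)$ and examples of our own choosing:

```latex
% Worked examples of the absolute logarithmic height.
\begin{align*}
&\gamma = p/q \ (\gcd(p,q) = 1,\ q > 0):
  &&\text{minimal polynomial } qX - p,\quad d = 1,\quad a_0 = q,\\
& &&h(p/q) = \log q + \log\max\{1, |p|/q\} = \log\max\{|p|,\, q\};\\
&\gamma = \sqrt2:
  &&\text{minimal polynomial } X^2 - 2,\quad d = 2,\quad a_0 = 1,\quad
     \gamma_{1,2} = \pm\sqrt2,\\
& &&h(\sqrt2) = \tfrac12\bigl(\log 1 + 2\log\sqrt2\bigr) = \tfrac12\log 2.
\end{align*}
```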
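Proposition 2.1 Let $\beta_1,\dots,\beta_n$ be Q-linearly independent algebraic numbers. There exists
a positive constant $C = C(\beta_1,\dots,\beta_n)$ such that, if $\gamma_1,\dots,\gamma_n$ are algebraic numbers
satisfying

$$[\mathbb{Q}(\gamma_1,\dots,\gamma_n) : \mathbb{Q}] \le D \quad\text{and}\quad \max_{1\le j\le n} h(\gamma_j) \le h$$

with $h \ge e$, then

$$|e^{\beta_1} - \gamma_1| + \cdots + |e^{\beta_n} - \gamma_n|
  \ge \exp\bigl\{-C D^{1+(1/n)}\, h\, (\log h + D\log D)(\log h + \log D)^{-1}\bigr\}.$$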


In the case n = 1, it is not difficult to deduce from a measure of algebraic approximation
for $\theta = \theta_1$ a transcendence measure of $\theta$, i.e. a lower bound for $|P(\theta)|$, when $P \in \mathbb{Z}[X]$ is a
non zero polynomial (this lower bound is explicit in terms of the degree and height of P).
Conversely, from a transcendence measure one easily deduces a measure of approximation.

Now assume one wants to prove that some field $K = \mathbb{Q}(\theta_1,\dots,\theta_n)$ has transcendence
degree $\ge 2$ (result of small transcendence degree). Assume not: if $\theta$ denotes any
transcendence basis, then the field K is a finite extension of $\mathbb{Q}(\theta)$. Using the same scheme of proof


⁵ We have at our disposal strong general zero estimates for commutative algebraic groups; see for instance
Bertrand, Daniel: Lemmes de zéros et nombres transcendants. Séminaire Bourbaki, vol. 1985/86. Astérisque
No. 145-146 (1987), 3, 21-44.
In the proofs of the results stated in section 3 a result of P. Philippon plays a fundamental role. However, so far,
apart from the setting of algebraic groups, the situation is not as good.
as above, one constructs a number $\gamma$ in K, and one meets a difficulty for the lower bound.
One solution, suggested by S. Lang in 1966 (in his book on transcendental numbers), is to
assume a transcendence measure for $\theta$; such a transcendence measure needs to be rather
sharp (the field $\mathbb{Q}(\theta)$ should have “finite transcendence type”). This suggests an inductive
procedure for algebraic independence. However, so far, results obtained in this way are
rather weak, partly because the Diophantine estimates which can be proved are far from the
best possible ones. Also this method is not universal, since some numbers like Liouville
ones will not satisfy the desired measure.

Another much more efficient solution arises from Gel’fond’s work in 1949: the
transcendence construction yields not only one element $\gamma \in K$, but a sequence $\gamma_N$ of such elements.
Instead of asking for a lower bound for each single $\gamma_N$, Gel’fond succeeded in proving that
not all elements $\gamma_N$ in this sequence can be too small.

One should point out that it is often possible to complete transcendence proofs (at least 
when they involve only functions of a single variable) without a sharp zero estimate, but for 
algebraic independence (as well as for quantitative results), one really needs precise zero 
estimates. 


2.2 Criteria for Algebraic Independence and Measures 


The proof of A.O. Gel’fond’s criterion involved only elementary properties of resultants
of two polynomials in a single variable. It enables one to prove that some fields have
transcendence degree $\ge 2$ over Q. In a few cases it shows that two numbers are algebraically
independent (one such example is A.O. Gel’fond’s solution of Conjecture 1.2 for d = 3).
Several authors contributed to refine it (see [84 W1]). Criteria for algebraic independence
(“large transcendence degree”) have been worked out by G.V. Chudnovsky (see Chap. 4
of [84 Chu]), E. Reyssat, R. Dvornicich, Yu. V. Nesterenko, P. Philippon, M. Waldschmidt
and Zhu Yao Chen (see [84 W1]), as well as Zhu Yao Chen ([85 Z1], [89 Z1], [89 Z2],
[90 Z]). One main tool, due to G.V. Chudnovsky, is the semi-resultant. These statements were
sufficient to yield lower bounds for the transcendence degree of fields generated by values
of the exponential function for instance, but these lower bounds were not sharp. The first
sharp estimate has been announced by P. Philippon in [84 P] and proved in [86 P]. Here
is Corollary 0.6 of [86 P] (the main result of [86 P] also splits degree and height; further
statements and references are given in [98 NP]).


Theorem 2.2 (Philippon’s Criterion for Algebraic Independence). Let n be a positive
integer, C a sufficiently large real number, $(\theta_1,\dots,\theta_n)$ an element of $\mathbb{C}^n$ and $\eta$ a positive
real number. Assume that for all sufficiently large N, there exist a positive integer
$m = m(N) \ge 1$ and polynomials $Q_1,\dots,Q_m$ in $\mathbb{Z}[X_1,\dots,X_n]$ with

$$\max_{1\le j\le m}\deg Q_j \le N, \qquad \max_{1\le j\le m} H(Q_j) \le e^{N}$$

and

$$0 < \max_{1\le j\le m}|Q_j(\theta_1,\dots,\theta_n)| \le e^{-CN^{\eta}},$$

such that the set of common zeros of the polynomials $Q_1,\dots,Q_m$ in the ball

$$\Bigl\{ z \in \mathbb{C}^n\ ;\ \max_{1\le i\le n}|z_i - \theta_i| \le e^{-3CN^{\eta}} \Bigr\}$$

is finite. Then the transcendence degree over Q of the field $\mathbb{Q}(\theta_1,\dots,\theta_n)$ is $> \eta - 1$.


This statement allows one to give lower bounds for the transcendence degree of some 
fields. Quantitative refinements of such criteria have been studied by several authors: 
the goal is now to produce a result of Diophantine approximation, which includes the 
lower bound for the transcendence degree. When the conclusion of the proof of algebraic 
independence is that some numbers $\theta_1,\dots,\theta_n$ are algebraically independent (an example is
the Lindemann-Weierstraß Theorem, another one is A.O. Gel’fond’s result on $\alpha^{\beta}$ and $\alpha^{\beta^2}$;
a few other examples are due to G.V. Chudnovsky), the most natural quantitative refinement
is a lower bound for $|P(\theta_1,\dots,\theta_n)|$, when P is a non zero polynomial with rational (or
algebraic) coefficients. This is called a measure of algebraic independence. As shown by
P. Philippon [85 P] and Yu.V. Nesterenko [85 Ne1], similar measures can also be given
when the conclusion is only that the transcendence degree of the field $\mathbb{Q}(\theta_1,\dots,\theta_n)$ over
Q is greater than some number k which is < n. Another quantitative refinement is to give
a lower bound for $\max_{1\le i\le n}|\theta_i - \gamma_i|$ when $\gamma_1,\dots,\gamma_n$ belong to a field of transcendence
degree $\le k$; this is called a measure of simultaneous approximation for $\theta_1,\dots,\theta_n$. An
example with k = 0 is Proposition 2.1. In [85 P], P. Philippon compares both measures;
he produces measures of algebraic independence and deduces measures of simultaneous 
approximations for various transcendental numbers. 

In [88 Ab] and [89 Ab], M. Ably proves a criterion for measures of simultaneous approx- 
imation which yields sharper estimates than [85 P] for numbers of the form ei, a", 
and also for G.V. Chudnovsky’s result on $\pi/\omega$ and $\eta/\omega$. This latter example is inspired by
G. Philibert’s result in [88 Ph] (see Theorem 3.1 below).

In [92 Jb], E.M. Jabbouri improves Philippon’s results [85 P] and [86 P] and produces 
a criterion which yields effective measures of algebraic independence, splitting degree and 
height. 

In order to prove general results for one parameter subgroups of commutative algebraic 
groups, M. Ably in [92 Ab2] combines the previous criteria for measures of algebraic 
independence [92 Jb] and measures of simultaneous approximation [89 Ab] into a single 
one. Finally C. Jadot [96 Jd] refines the previous criteria of algebraic independence due to 
P. Philippon and E.M. Jabbouri, so that he also includes criteria for linear independence due 
to Yu. V. Nesterenko, P. Bundschuh and T. Töpfer. Again Jadot’s results are effective (i.e.
his criterion can be used to produce measures of linear independence as well as measures 
of algebraic independence). 

The reference [98 NP] contains an extensive exposition of the basic tools from commu- 
tative algebra and elimination theory, including Chow forms, which have been introduced 
by Yu.V. Nesterenko in this context. Following a lecture of D. Roy, we only give here the 
definition of the Chow form associated with an algebraic variety $V \subset \mathbb{P}_m(\overline{\mathbb{Q}})$ of dimension
k defined over the field Q of rational numbers. Let

$$U_i = U_0^{(i)}X_0 + \cdots + U_m^{(i)}X_m \qquad (0 \le i \le k)$$

be generic linear forms. There exists an irreducible polynomial

$$F \in \mathbb{Q}[U^{(0)},\dots,U^{(k)}],$$

where $U^{(i)}$ stands for the (m + 1)-tuple $(U_0^{(i)},\dots,U_m^{(i)})$, such that, for any linear forms $L_0,\dots,L_k$
in $\overline{\mathbb{Q}}[X_0,\dots,X_m]$, say

$$L_i = u_0^{(i)}X_0 + \cdots + u_m^{(i)}X_m \qquad (0 \le i \le k),$$

the condition

$$F(u^{(0)},\dots,u^{(k)}) = 0$$

holds if and only if

$$L_0(x) = \cdots = L_k(x) = 0$$

have a solution x in V. This form F, which is unique up to a multiple in $\mathbb{Q}^{\times}$, is the Chow
form of V.
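
The two extreme dimensions give the simplest Chow forms. The following sketch is our own illustration of the definition (a rational point for k = 0, and the whole projective space for k = m), not an example taken from [98 NP]:

```latex
% Sketch: Chow forms in the two extreme cases.
\begin{align*}
&k = 0,\quad V = \{a\},\quad a = (a_0 : \cdots : a_m) \text{ with coprime integer coordinates}:\\
&\qquad F(U^{(0)}) = a_0U_0^{(0)} + \cdots + a_mU_m^{(0)},
  \qquad F(u^{(0)}) = 0 \iff L_0(a) = 0;\\
&k = m,\quad V = \mathbb{P}_m:\\
&\qquad F(U^{(0)},\dots,U^{(m)}) = \det\bigl(U_j^{(i)}\bigr)_{0\le i,j\le m},
  \qquad F(u^{(0)},\dots,u^{(m)}) = 0 \iff L_0,\dots,L_m \text{ have a common zero in } \mathbb{P}_m.
\end{align*}
% In both cases F is irreducible: a non zero linear form in the first case,
% the determinant of a generic square matrix in the second.
```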


2.3 Hilbert Nullstellensatz 


The first results on large transcendence degree for values of elliptic functions are due to
D.W. Masser and G. Wüstholz in the early 80’s. They proved an elliptic analog of Chudnovsky’s
result on the exponential function for elliptic functions with algebraic invariants without
complex multiplication. In order to provide a replacement for the elimination technique
of G.V. Chudnovsky by means of semi-resultants, they prove (and use) an explicit lower
bound for the degree in Hilbert’s Nullstellensatz. They need also tools from commutative
algebra in order to refine their zero estimate. Another tool is an effective elliptic version of
a Theorem of Kolchin on subgroups of products of algebraic groups.

The bounds of D.W. Masser and G. Wüstholz for the degree in Hilbert’s Nullstellensatz
were not sharp. W.D. Brownawell succeeded in obtaining essentially best possible estimates,
by means of the powerful tools of Yu.V. Nesterenko and P. Philippon in elimination theory,
as well as deep results from the theory of several complex variables of C. Berenstein and
A. Yger. However, the method of Masser-Wüstholz then produces weaker results than
Philippon’s one in [86 P] (already in 1983 P. Philippon had much stronger results for large
transcendence degree than those obtained by D.W. Masser and G. Wüstholz).

However an alternative approach to this question has been worked out by 
W.D. Brownawell in [89 Br]. He establishes sharp lower bounds for the maximum abso- 
lute values of integral polynomials having no common zero within a ball of given radius 
centered at a fixed point in C”. This allows him to relax the technical hypothesis: in place 
of a measure of linear independence, he only needs to assume that some lower bound holds 
for infinitely many values of the parameter. In [89 B], he works out the case of algebraic 
independence of values of the exponential function, or of elliptic functions without complex 
multiplication. The case of complex multiplication is considered in his joint paper with 
R. Tubbs [89 BT]. These two papers [89 B] and [89 BT] also give sharp effective quantita- 
tive measures of algebraic independence. Further references on this topic are [91 Tu] and 
[96 Den2]. 
2.4 Simultaneous Diophantine Approximation and Algebraic Independence


A new link between simultaneous approximation and algebraic independence arises from
the work of D. Roy, M. Waldschmidt, M. Laurent, P. Philippon and G. Diaz (see [95 RW],
[97 Di2], [97 P], [97 RW1], [97 RW2], [98 LR2], [98 P1], [98 P2] and [98 P3]).

The basic idea is as follows. A complex transcendental number $\theta$ can be approximated
by algebraic numbers. For instance Dirichlet’s box principle enables one to produce a non
zero polynomial $P \in \mathbb{Z}[X]$ such that $|P(\theta)|$ is small; then one considers the root of P
which is closest to $\theta$. This last step involves some technical complications, but E. Wirsing⁶
showed in 1961 how to overcome them. Denote by $M(\gamma)$ Mahler’s measure of the algebraic
number $\gamma$, which is defined by

$$M(\gamma) = e^{d\,h(\gamma)},$$

where $d = [\mathbb{Q}(\gamma) : \mathbb{Q}]$ is the degree of $\gamma$.
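
For orientation, here is how Mahler's measure looks in its usual product form $M(\gamma) = a_0\prod_{i}\max\{1,|\gamma_i|\}$ (with $a_0$ and $\gamma_i$ the leading coefficient and the roots of the minimal polynomial of $\gamma$ over Z), on three examples of our own choosing:

```latex
% Worked examples of Mahler's measure M(\gamma) = a_0 \prod_i \max\{1, |\gamma_i|\}.
\begin{align*}
&\gamma = p/q \ (\gcd(p,q) = 1,\ q > 0):
  && M(p/q) = q\,\max\{1, |p|/q\} = \max\{|p|,\, q\};\\
&\gamma = \sqrt2 \quad (X^2 - 2):
  && M(\sqrt2) = \sqrt2\cdot\sqrt2 = 2;\\
&\gamma = \tfrac{1+\sqrt5}{2} \quad (X^2 - X - 1):
  && M(\gamma) = \tfrac{1+\sqrt5}{2}\cdot\max\bigl\{1, \tfrac{\sqrt5-1}{2}\bigr\}
     = \tfrac{1+\sqrt5}{2}.
\end{align*}
% Taking logarithms and dividing by the degree d recovers the absolute logarithmic
% height: for instance \log M(\sqrt2) = \log 2 = 2\,h(\sqrt2).
```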


Theorem 2.3 (Wirsing). For any integer $D \ge 2$ there exist infinitely many algebraic
numbers $\gamma$ which satisfy

$$[\mathbb{Q}(\gamma) : \mathbb{Q}] \le D \quad\text{and}\quad |\theta - \gamma| \le M(\gamma)^{-D/4}.$$


Wirsing’s arguments are related to Gel’fond’s proof of his transcendence criterion.
Apart from a result by H. Davenport and W.M. Schmidt (1968), the constant in Wirsing’s
result has not yet been substantially improved, but many useful variants have recently
been produced. In [97 RW1] and [97 RW2], one considers more generally the simultaneous
approximation of complex numbers by algebraic numbers. Given $\theta_1,\dots,\theta_n$ in C, one
shows the existence of algebraic numbers $\gamma_1,\dots,\gamma_n$ such that $\max|\theta_i - \gamma_i|$ is bounded
from above in terms of the degree $[\mathbb{Q}(\gamma_1,\dots,\gamma_n) : \mathbb{Q}]$, as well as the heights of the $\gamma_i$’s. The
transcendence degree over Q of the field $\mathbb{Q}(\theta_1,\dots,\theta_n)$ is the invariant which should, at
least conjecturally, control the situation.

For instance, assume that the transcendence degree over Q of the field $K = \mathbb{Q}(\theta_1,\dots,\theta_n)$
is 1. It is not too hard to check that an algebraic approximation of a single transcendental
element in K will enable one to produce “simultaneous” algebraic approximations to
$\theta_1,\dots,\theta_n$. In other terms, if $\theta_1,\dots,\theta_n$ have the property that they cannot be simultaneously
approximated by algebraic numbers, then they cannot lie in a field of transcendence
degree 1, and two at least of them are algebraically independent. This is how a link between
simultaneous Diophantine approximation and algebraic independence arises.

Usually, “simultaneous” approximation (for instance of real numbers by rational ones)
means that one has a good control of a common denominator of the rational approximations;
more generally, “simultaneous” approximation of complex numbers by algebraic ones usually
means that one has a good control of the absolute logarithmic height of the projective
point whose coordinates are the algebraic approximations. Here, we do not pay too much
attention to the constants, but we are interested in algebraic numbers of large degree, and
we are concerned with bounds for the degree of the number field generated by the algebraic
approximations.


⁶ Wirsing, Eduard: Approximation mit algebraischen Zahlen beschränkten Grades. J. reine angew. Math. 206
(1960), 67-77.
The following statement is Conjecture 1.7 of [97 RW2].


Conjecture 2.4 Let a > 1, b > 1 be real numbers, t a non negative integer and $\theta_1,\dots,\theta_n$
complex numbers. Let $g : \mathbb{N}\times\mathbb{R}_{>0} \to \mathbb{R}_{>0}\cup\{\infty\}$ be a mapping with the following
property: there exist a positive integer $D_0$ together with a real number $h_0 \ge 1$ such that,
for any integer $D \ge D_0$, any real number $h \ge h_0$ and any n-tuple $(\gamma_1,\dots,\gamma_n)$ of algebraic
numbers satisfying

$$[\mathbb{Q}(\gamma_1,\dots,\gamma_n) : \mathbb{Q}] \le D \quad\text{and}\quad \max_{1\le i\le n} h(\gamma_i) \le h,$$

we have

$$\max_{1\le i\le n}|\theta_i - \gamma_i| \ge \exp\{-g(D,h)\}.$$

Let $(D_\nu)_{\nu\ge1}$ be an increasing⁷ sequence of positive integers and $(h_\nu)_{\nu\ge1}$ an increasing
sequence of positive real numbers with $D_\nu + h_\nu \to \infty$. Assume

$$D_{\nu+1} \le aD_\nu, \qquad h_{\nu+1} \le bh_\nu \qquad (\nu \ge 1)$$

and

$$\limsup_{\nu\to\infty}\ \frac{1}{D_\nu^{1+(1/t)}\,h_\nu}\ g(D_\nu,h_\nu) = 0.$$

Then the transcendence degree over Q of the field $\mathbb{Q}(\theta_1,\dots,\theta_n)$ is > t.


So far, only the case n = 1 is solved, by M. Laurent and D. Roy [98 LR2].

The Lindemann-Weierstraß Theorem on the algebraic independence of $e^{\beta_1},\dots,e^{\beta_n}$ (when
$\beta_1,\dots,\beta_n$ are Q-linearly independent algebraic numbers) follows from Conjecture 2.4 and
Proposition 2.1. The special case n = 2 (algebraic independence of $e^{\beta_1}$ and $e^{\beta_2}$) follows
from Proposition 2.1 together with the special case of Conjecture 2.4 which is proved in
[97 RW2].

This approach yields results of algebraic independence (only small transcendence degree,
so far) by means of Laurent’s interpolation determinants. Previously, the method of
interpolation determinants was limited to proofs involving only algebraic numbers: it was
efficient only for showing that some fields have transcendence degree $\ge 1$, as well as for
providing quantitative versions of such results, namely measures of transcendence or of
simultaneous approximation by algebraic numbers. The point is that such measures
turn out to yield algebraic independence results!

Another solution to the problem of proving algebraic independence by means of inter- 
polation determinants has been found by M. Laurent and D. Roy: in [98 LR1] they first 
extend Gel’fond’s criterion by including multiplicities; next they show that the interpola- 
tion determinant produces a polynomial which, not only is small at some point, but also has 
small derivatives at that point. 

In a forthcoming paper they also include derivatives in Philippon’s criterion of algebraic
independence. This will produce results of large transcendence degree by means of
interpolation determinants. Unfortunately these new tools, which are definitely valuable from
the point of view of the method, do not seem to produce new results which could not be
reached by previous arguments, so far.

⁷ Increasing sequences are sometimes called non decreasing sequences: we do not exclude a constant sequence
for instance.

Notice also that A. Sert’s proof in [97 Se] of a measure of algebraic independence for
$e^{\beta_1},\dots,e^{\beta_n}$ (quantitative version of the Lindemann-Weierstraß Theorem) involves
interpolation determinants. Another way of replacing the auxiliary function by an interpolation
determinant has been worked out by P. Philippon in [97 P], [98 P1] and [98 P2]. One
main difference between [97 RW1] and [97 P] is that Philippon considers approximation
by algebraic cycles rather than by algebraic numbers. For instance, in the one dimensional
case, given a transcendental number $\theta$, instead of considering $|\theta - \gamma|$ for some algebraic
number $\gamma$, he only needs to get a non zero polynomial $P \in \mathbb{Z}[X]$ such that $|P(\theta)|$ is small;
and Dirichlet’s box principle is just sufficient!


3 Gel’fond’s Method and its Developments 


3.1 Small Transcendence Degree 


We start with “small transcendence degree”. Many very interesting results are due to
G.V. Chudnovsky [84 Chu], [84 W1]. We recall the following one: Let E be an elliptic
curve over Q, $\omega$ a non zero period of a differential form of first kind and $\eta$ the corresponding
period of a differential form of second kind. Then the two numbers $\omega/\pi$ and $\eta/\pi$ are
algebraically independent.

Let $\xi$ be an algebraic number and E the elliptic curve $y^2 = (1 - x^2)(1 - \xi x^2)$. In
[96 An], Y. André states Chudnovsky’s result in terms of values of hypergeometric
functions, and gives a completely new proof by means of the Gel’fond-Dèbes method involving
G-functions. He also produces a p-adic analog.

Here is an extension of G.V. Chudnovsky’s result to Abelian varieties, due to K.G. Vasilev
[96 V]. Let $F \in \mathbb{Q}[X,Y]$ be an irreducible polynomial such that the curve $F(x,y) = 0$ has
genus $g \ge 1$. Let $\omega_1,\dots,\omega_{2g}$ be differentials of first or second kind and $\gamma_1,\dots,\gamma_{2g}$ be
closed integration paths on the Riemann surface such that the $2g \times 2g$ matrix

$$\Omega = \Bigl(\int_{\gamma_i}\omega_j\Bigr)_{1\le i,j\le 2g}$$

is non singular. Then any g + 1 rows of $\Omega$ contain at least 2 algebraically independent
numbers. As a consequence, any subset with [(n + 1)/2] elements of the following set of
values of the Beta function

$$\{B(k/n,\ 1/2)\ ;\ 1 \le k \le n-1,\ k \ne n/2\}$$

contains at least 2 algebraically independent numbers.

The fact that $\Gamma(1/4)$ and $\Gamma(1/3)$ are not Liouville numbers, which was claimed by
G.V. Chudnovsky already in 1980 (see also [84 Chu] Th. 3 p. 18), has been proved only
recently by P. Philippon in [98 P2]. More generally Philippon produces a sharp measure of
algebraic independence of $\pi/\omega$ and $\eta/\omega$:


Theorem 3.1 Let $\wp$ be a Weierstraß elliptic function with algebraic invariants $g_2$, $g_3$. Let $\omega$
be a non zero period, and $\eta$ be the corresponding quasi-period of the associated Weierstraß
zeta function. There exists a constant C > 0 such that, for any non zero polynomial
$P \in \mathbb{Z}[X,Y]$ of degree $\le D$ and usual height $\le H$,

$$|P(\pi/\omega,\ \eta/\omega)| > \exp\{-CD^{2}(\log H + D\log D)\}.$$


After Chudnovsky’s pioneering work, weaker estimates had been obtained by
G. Philibert in [88 Ph], by M. Ably in [89 Ab] and by E.M. Jabbouri in [92 Jb]. Notice that
G. Philibert’s measure in [88 P] played an interesting role in connection with the work of
P. Philippon related to the algebraic independence of $\pi$ and $e^{\pi}$ (see §4 below).

Gel’fond’s algebraic independence Theorem (1949) on $\alpha^{\beta}$ and $\alpha^{\beta^2}$ has been extensively
studied. The measure of algebraic independence he obtained with N.I. Fel’dman has been
improved, especially by G. Diaz in [87 Di1], [88 Di], [89 Di1] and [90 Di]; see also [90 She].
Here is the best known result by G. Diaz:


Theorem 3.2 Let $\alpha$ be a non zero algebraic number, $\log \alpha$ a non zero logarithm of $\alpha$ and
$\beta$ a cubic number. There exists a constant $c > 0$ such that if $P \in \mathbb{Z}[X, Y]$ is a non zero
polynomial of degree $\le D$ and usual height $\le H$, then

$$\log |P(\alpha^{\beta}, \alpha^{\beta^2})| > -\exp\{c D (D + \log H)\}.$$


The elliptic analog of Gel’fond’s Theorem on $\alpha^{\beta}$ and $\alpha^{\beta^2}$, announced by D.W. Masser and
G. Wüstholz in the early 80’s, has been proved in [86 MW]. Much sharper results were
then achieved by P. Philippon in 1983. In [86 Tu2], R. Tubbs gives a lower bound
for $|P(\wp(\beta u), \wp(\beta^2 u))|$ in the CM case (here, $\wp$ is a Weierstraß elliptic function
with algebraic invariants and complex multiplication, $u$ is a complex number such that
$\wp(u)$ is algebraic, $P$ a polynomial with rational integer coefficients and $\beta$ is an algebraic
number of degree 3 over the field of complex multiplications). In [88 Tu], he removes
the assumption that $\wp(u)$ is algebraic and produces a lower bound for pairs of relatively
prime integral polynomials evaluated at $(\wp(u), \wp(\beta u), \wp(\beta^2 u))$ (elliptic analog of a result
of W.D. Brownawell in 1979). The elliptic analog of Theorem 3.2 has been established
by S.O. Shestakov [92 She]. For an updated survey of such questions with new results,
see [96 Ca].

General results of small transcendence degree related to one parameter subgroups of
commutative algebraic groups are obtained in [85 W]. The main difference with [86 W] is
that no technical hypothesis is needed for small transcendence degree. These results do not
exhaust the possibilities of the method: some results by G.V. Chudnovsky [84 Chu] are not
covered by the statements of [85 W] and many variations are possible (see for instance the
papers [86 TY] by M. Toyoda and T. Yasuda and [86 Tu1], [87 Tu1] by R. Tubbs). A more
systematic study has been achieved by R. Tubbs in [87 Tu2] and [90 Tu2]. Further results
related to Baker’s method (see Chap. 2 of [84 Chu]) are due to G. Diaz [92 Di].

Transcendence measures enable one to construct algebraically independent numbers: an
early example is due to N.I. Fel’dman (1964): using a lower bound for $|\beta - \log \alpha|$ when $\alpha$ and
$\beta$ are algebraic numbers, he proved that for some “Liouville” numbers $a$, the two numbers $a$
and $e^{a}$ are algebraically independent. Here, “Liouville” number means a complex number
$a$ which is very well approximated by rational, or more generally algebraic, numbers.
In the same way, R. Tubbs [90 Tu1] uses a lower bound for elliptic linear forms in
logarithms and proves, for algebraic $\beta$, the algebraic independence of $u$ and $\wp(\beta u)$ for
some Liouville numbers $u$, and also the algebraic independence of $\wp(u)$ and $\wp(\beta u)$ for
some complex numbers $u$ such that $\wp(u)$ is Liouville. In [93 Ca], D. Caveny shows the
algebraic independence of $a$ and $a^{b}$ when $a$ and $b$ are Liouville numbers; with R. Tubbs
in [93 CT] they refine Fel’dman’s above mentioned result on $a$ and $e^{a}$ and also discuss
the elliptic analog. Using a lower bound for linear forms in logarithms, D. Caveny gets in
[94 Ca] the algebraic independence of $a$ and $a^{\beta}$ for a Liouville number $a$ and an algebraic
$\beta$. A similar elliptic situation is considered by M. Ably in [91 Ab].
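
To make the notion of “very well approximated by rational numbers” concrete, here is a small sketch using Liouville’s classical constant $\sum_{k \ge 1} 10^{-k!}$. This example is not taken from the papers cited above, and the function names are ours. The code checks, in exact rational arithmetic, that the $n$-th partial sum $p/q$ with $q = 10^{n!}$ lies within $q^{-n}$ of a longer partial sum used as a stand-in for the constant itself.

```python
from fractions import Fraction
from math import factorial


def liouville_partial(n):
    """n-th partial sum of Liouville's constant, sum_{k=1}^{n} 10^(-k!)."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, n + 1))


def is_good_approximation(n, extra_terms=2):
    """Check 0 < L - p/q < q^(-n) for p/q = liouville_partial(n).

    L is replaced by a longer partial sum (an assumption of this sketch);
    the omitted tail is smaller than 2 * 10^(-(n + extra_terms + 1)!),
    far below the gap being tested.
    """
    approx = liouville_partial(n)
    q = approx.denominator  # equals 10^(n!) because the numerator ends in 1
    proxy = liouville_partial(n + extra_terms)
    return 0 < proxy - approx < Fraction(1, q ** n)


if __name__ == "__main__":
    for n in range(1, 5):
        q_digits = factorial(n)
        print(f"n = {n}: q = 10^{q_digits}, within q^-{n}: {is_good_approximation(n)}")
```

Since the exponent $n$ can be taken arbitrarily large, no fixed power of $q$ bounds the quality of approximation, which is the defining property of a Liouville number.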

Usually, each single recent paper deals with either transcendence degree $\ge 2$ or else large
transcendence degree. Among the few exceptions are [89 Tu], where R. Tubbs produces fields
of transcendence degree $\ge 3$ generated by values of an elliptic function, and [98 P2].

We conclude this section with a partial result concerning Conjecture 1.5. It is still not
proved that the field generated by all the logarithms of algebraic numbers has transcendence
degree $\ge 2$ over $\mathbb{Q}$: it might just as well be an algebraic extension of $\mathbb{Q}(\pi)$ (but this would
be unexpected!). One of the very few known results in this direction arises from [97 RW1]
and [97 RW2]:


Theorem 3.3 Let $\lambda_1, \dots, \lambda_n$ be $\mathbb{Q}$-linearly independent elements of $\mathcal{L}$. Assume that the
transcendence degree over $\mathbb{Q}$ of the field $\mathbb{Q}(\lambda_1, \dots, \lambda_n)$ is 1. Then for each non zero
homogeneous polynomial $Q \in \mathbb{Q}[X_1, \dots, X_n]$ of degree 2 (quadratic form), the number
$Q(\lambda_1, \dots, \lambda_n)$ does not vanish.


3.2 Large Transcendence Degree 


In [84 W1], we noticed that Philippon’s results on the elliptic analog of the Gel’fond-Schneider
problem were stronger than the exponential result. Shortly after, P. Philippon
succeeded in proving the expected result in the exponential case.

Let $t$ be the transcendence degree over $\mathbb{Q}$ of the field $\mathbb{Q}(\alpha^{\beta}, \alpha^{\beta^2}, \dots, \alpha^{\beta^{d-1}})$. After the
works of G.V. Chudnovsky, P. Warkentin, P. Philippon, E. Reyssat, R. Endell,
W.D. Brownawell and Yu.V. Nesterenko (see [84 W1] as well as [84 Ne1]), the best known
lower estimate for $t$ was
i d+1 
7 
In 1984 P. Philippon succeeded in reaching the lower bound $t \ge (d-1)/2$, which can be written
$t \ge [d/2]$ where $[\,\cdot\,]$ denotes the integral part. In [84 P] he announced the result, together with
a sharp criterion for algebraic independence, as well as a quantitative refinement (measure
of algebraic independence) which he discussed in [85 P]. Proofs are provided in [86 P].
They involve the tools from commutative algebra introduced earlier by Yu.V. Nesterenko.
Algebraic independence measures are given in [85 Ne1] and [86 Ne]. Further references
on this topic are [85 P] and [87 Ne]. It should be pointed out that the arguments used in the
proofs also work in the context of Mahler’s method (see [85 Ne1]).
In 1987, G. Diaz [87 Di] obtained a refinement, which is the best known estimate so far:
Theorem 3.4 (G. Diaz). Let $\alpha$ be a non zero complex algebraic number, $\log \alpha$ a non zero
logarithm of $\alpha$ and $\beta$ an algebraic number of degree $d \ge 2$. Then the transcendence degree
$t$ over $\mathbb{Q}$ of the field $\mathbb{Q}(\alpha^{\beta}, \alpha^{\beta^2}, \dots, \alpha^{\beta^{d-1}})$ satisfies

$$t > \frac{d-1}{2}.$$

An equivalent formulation is $t \ge [(d+1)/2]$. One main tool in Diaz’ proof is Philippon’s
Criterion (Proposition 2.2). Another important feature of Diaz’ proof is that he succeeds in
working out an idea, going back to G.V. Chudnovsky, which consists in deriving the coefficients
of the auxiliary polynomial with respect to a transcendence basis. The elimination process
has been revisited by Yu.V. Nesterenko [89 Ne] who avoids the use of Philippon’s criterion.
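
Because $t$ is an integer, passing from a non-strict to a strict inequality at $(d-1)/2$ changes the bound only for odd $d$. The sketch below assumes the two bounds are read as $t \ge (d-1)/2$ (Philippon) and $t > (d-1)/2$ (Diaz), and checks the integral-part reformulations $[d/2]$ and $[(d+1)/2]$ by brute force. The helper names are ours.

```python
from fractions import Fraction
from math import floor


def least_integer_at_least(x):
    """Smallest integer t with t >= x."""
    return -floor(-x)


def least_integer_above(x):
    """Smallest integer t with t > x."""
    return floor(x) + 1


def compare_bounds(d):
    """Return (non-strict bound, strict bound) at (d - 1)/2 for integer t."""
    x = Fraction(d - 1, 2)
    return least_integer_at_least(x), least_integer_above(x)


if __name__ == "__main__":
    for d in range(2, 10):
        weak, strict = compare_bounds(d)
        # The integral parts [d/2] and [(d+1)/2] are d // 2 and (d + 1) // 2.
        print(f"d = {d}: t >= {weak} = [d/2], t >= {strict} = [(d+1)/2]")
```

For even $d$ the two bounds coincide, and for odd $d$ the strict inequality gains exactly one.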

All the above works, including Diaz’ results in [89 Di1] and [89 Di2], deal more generally
with fields generated over $\mathbb{Q}$ by numbers of the form $e^{x_i y_j}$, where $x_1, \dots, x_d$ are $\mathbb{Q}$-linearly
independent complex numbers, and also $y_1, \dots, y_\ell$ are $\mathbb{Q}$-linearly independent numbers.
Of course, in such a general setup, a technical hypothesis is needed. An extension to large
transcendence degree of Diaz’ result [92 Di] using Baker’s method is due to Chen Gong
Liang [93 Chn], [95 Chn].

Several quantitative refinements of the Lindemann-Weierstraß Theorem, i.e. measures
of algebraic independence for $e^{\alpha_1}, \dots, e^{\alpha_n}$, are known. They used to be produced by means
of Siegel-Shidlovskii’s method involving E-functions. In 1994, M. Ably [94 Ab] improved
such an estimate from Nesterenko (1977) by means of Gel’fond-Chudnovsky’s method,
using the criterion of Jabbouri-Philippon [92 Jb]. The result in [94 Ab] is “effective”, but
not all constants are explicitly computed. A completely explicit measure is given in A. Sert’s
thesis [97 Se].

Here is the elliptic analog of the Lindemann-Weierstraß Theorem:


• Let $\wp$ be a Weierstraß elliptic function with algebraic invariants. Denote by $k$ the
field of endomorphisms of $\wp$ (which is either $\mathbb{Q}$, in the non CM case, or else an imaginary
quadratic field, in the CM case). If $\beta_1, \dots, \beta_n$ are $k$-linearly independent complex
numbers, then the field $\mathbb{Q}(\wp(\beta_1), \dots, \wp(\beta_n))$ has transcendence degree $\ge n/[k : \mathbb{Q}]$
over $\mathbb{Q}$.


As pointed out in [84 W1], the case of complex multiplication had been settled in 1983 by
P. Philippon and G. Wüstholz, independently. In fact, the non CM case also follows from
the works of P. Philippon and G. Wüstholz in 1983. In [86 Jb], E.M. Jabbouri produces
quantitative results, including measures of algebraic independence in the CM case. His
result deals more generally with Abelian varieties. His proof uses Philippon’s zero estimate
as well as the elimination techniques from [86 P]. Extensions of the Lindemann-Weierstraß
Theorem to algebraic groups, including the quantitative aspect, have also been considered
in [87 P]. Jabbouri’s measure of algebraic independence [86 Jb] has been refined by M. Ably
in [92 Ab1], by means of a larger number of redundant variables. In the case of elliptic
functions, Yu.V. Nesterenko’s measure of algebraic independence for $\wp(\beta_1), \dots, \wp(\beta_n)$ in
the CM case [92 Ne] is better (see also [91 Ne]). The sharpest known results in this context
are due to Yu.V. Nesterenko [95 Ne].

General results on large transcendence degree are obtained in [84 W2] for the exponential
function in several variables, in [84 W3] for one parameter subgroups of products of
linear groups and elliptic curves (giving mixed exponential-elliptic algebraic independence
results) and in [86 W] for several parameters analytic subgroups of commutative algebraic
groups. Here is an example (Corollary 15.4 of [86 W]), which displays a new kind of
technical hypothesis.


Theorem 3.5 Let $A$ be a simple abelian variety of dimension $g$ defined over the field of
algebraic numbers, $V$ a vector subspace of dimension $n$ of the tangent space at the origin
of $A$, and $Y = \mathbb{Z}y_1 + \cdots + \mathbb{Z}y_m$ a finitely generated subgroup of $V$ of rank $m$. Let $K$ be
a subfield of $\mathbb{C}$ of transcendence degree $t$ over $\mathbb{Q}$, such that $\exp_A Y \subset A(K)$. Assume that
for any $\epsilon > 0$ there exists $H_0 > 0$ such that, for any $H \ge H_0$, any $(h_1, \dots, h_m) \in \mathbb{Z}^m$ with

$$0 < \max\{|h_1|, \dots, |h_m|\} \le H,$$

and any $\omega \in \ker \exp_A$, the condition

$$|h_1 y_1 + \cdots + h_m y_m - \omega| < \exp(-H^{\epsilon})$$

implies $h_1 y_1 + \cdots + h_m y_m = \omega$. Then

$$t \ge \frac{mg}{n(m + 2g)} - 1.$$


For one parameter subgroups (i.e. $n = 1$), M. Ably [89 Ab], [92 Ab2] and R. Tubbs
[91 Tu] improved the conclusion and got a strict inequality, which means $t \ge [mg/(m+2g)]$.
The method is inspired by Diaz [89 Di1]. The elliptic analog of Diaz’s Theorem 1.3 has
been obtained by M. Ably [92 Ab2] and S.O. Shestakov [91 She], [92 She]:


Theorem 3.6 Let $\wp$ be a Weierstraß elliptic function with algebraic invariants $g_2$, $g_3$.
Denote by $F$ the field of endomorphisms of the associated elliptic curve, and by $\delta = [F : \mathbb{Q}]$
the degree of $F$ (that means $\delta = 2$ in the CM case, $\delta = 1$ otherwise). Let $\beta$ be an
algebraic number of degree $d > 2/\delta$ and $u$ a complex number such that none of the $d$
numbers $u, \beta u, \dots, \beta^{d-1} u$ is a pole of $\wp$. Then the transcendence degree over $\mathbb{Q}$ of the
field $\mathbb{Q}(\wp(u), \wp(\beta u), \dots, \wp(\beta^{d-1} u))$ is at least


| 4 | for 5’ = 2(CM case), 
|e | jor 6:=T. 


In [91 Ab], M. Ably improves the result (adding 1 to the transcendence degree) in the
special case where one of the numbers $\wp(u), \wp(\beta u), \dots, \wp(\beta^{d-1} u)$ has extremely good
approximations by algebraic numbers.

In [92 Ab2], M. Ably gives quantitative refinements to the result of [86 W] in the one
dimensional case, involving a weakened technical hypothesis. These far-reaching estimates
on algebraic independence for one parameter subgroups of commutative algebraic groups
have been refined by D. Caveny [96 Ca], who splits the dependence in the degree and height,
as in G. Diaz’ Theorem 3.2.
4 Modular Functions


The first results on algebraic independence of values of modular functions are due to
D. Bertrand. The proofs involved elliptic functions. Only in 1995 did a group of mathematicians
in St Etienne⁸ succeed in proving a transcendence result by means of modular
functions: they solved a problem of Mahler in the complex case and of Manin in the p-adic
case by proving the transcendence of $J(q)$ when $q$ is an algebraic (complex or p-adic)
number in the domain $0 < |q| < 1$, and $J$ is the modular invariant. In the complex case,
this result solves a special case of the “mixed four exponentials Conjecture”. In fact deep
connections with the classical four exponentials exist [97 Di1].

One year later Nesterenko [96 Ne1], [96 Ne2] obtained a remarkable result of algebraic
independence:


Theorem 4.1 (Nesterenko). Let $q$ be a complex or p-adic number satisfying $0 < |q| < 1$.
Then at least three of the four numbers

$$q, \quad P(q), \quad Q(q), \quad R(q)$$

are algebraically independent.
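
For readers who wish to experiment with the functions in Theorem 4.1, here is a small sketch. It assumes Ramanujan’s normalization $P = 1 - 24\sum \sigma_1(n) q^n$, $Q = 1 + 240\sum \sigma_3(n) q^n$, $R = 1 - 504\sum \sigma_5(n) q^n$, which the text above does not spell out. The function names are ours. As a sanity check it verifies the first coefficients of the classical identity $Q^3 - R^2 = 1728\,\Delta$ with $\Delta = q - 24q^2 + 252q^3 - \cdots$.

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)


def eisenstein_coeffs(terms):
    """First `terms` q-expansion coefficients of P, Q, R (Ramanujan's notation)."""
    P = [1] + [-24 * sigma(1, n) for n in range(1, terms)]
    Q = [1] + [240 * sigma(3, n) for n in range(1, terms)]
    R = [1] + [-504 * sigma(5, n) for n in range(1, terms)]
    return P, Q, R


def series_mul(a, b):
    """Product of two power series truncated to the same length."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(len(a))]


def evaluate(coeffs, q):
    """Evaluate a truncated q-expansion at a numerical value of q."""
    return sum(c * q ** n for n, c in enumerate(coeffs))


if __name__ == "__main__":
    P, Q, R = eisenstein_coeffs(6)
    Q3 = series_mul(series_mul(Q, Q), Q)
    R2 = series_mul(R, R)
    discriminant = [x - y for x, y in zip(Q3, R2)]
    print(discriminant[:4])  # coefficients of 1728 * Delta
    print(evaluate(P, 0.01), evaluate(Q, 0.01), evaluate(R, 0.01))
```

Truncated expansions give only numerical approximations of $P(q)$, $Q(q)$, $R(q)$ for small $|q|$.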


These functions $P$, $Q$ and $R$ are the Eisenstein series of weights 2, 4 and 6 respectively.
In the complex case, an equivalent statement in terms of Weierstraß elliptic functions reads
as follows:


Corollary 4.2 Let $g_2$ and $g_3$ be two complex numbers satisfying $g_2^3 \ne 27 g_3^2$. Denote by
$\omega_1$, $\omega_2$ a pair of fundamental periods of the elliptic curve $y^2 = 4x^3 - g_2 x - g_3$, and by
$\eta_1$, $\eta_2$ the corresponding quasi-periods. Then the transcendence degree over $\mathbb{Q}$ of the field

$$\mathbb{Q}\bigl(g_2, g_3, \omega_1/\pi, \eta_1/\pi, e^{2i\pi\omega_1/\omega_2}\bigr)$$

is at least 3.


Notice that without $e^{2i\pi\omega_1/\omega_2}$, it was known by G.V. Chudnovsky (see §3.1 above, as
well as [84 Chu] and [84 W1]) that the transcendence degree is at least 2.
One of the most impressive consequences of Corollary 4.2 is:


Corollary 4.3 The three numbers $\pi$, $e^{\pi}$ and $\Gamma(1/4)$ are algebraically independent.
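
The passage from the statement on periods and quasi-periods to this corollary can be sketched as follows. The period values used are the classical ones for the curve $y^2 = 4x^3 - 4x$, which has complex multiplication by $\mathbb{Z}[i]$. They are quoted here as an illustration and are not taken from the text above, and the sign of $\eta_2$ depends on conventions.

```latex
\begin{align*}
  g_2 &= 4, \qquad g_3 = 0, \qquad g_2^3 - 27 g_3^2 = 64 \neq 0, \\
  \omega_1 &= \int_1^{\infty} \frac{dx}{\sqrt{x^3 - x}}
            = \tfrac{1}{2} B(1/4, 1/2)
            = \frac{\Gamma(1/4)^2}{\sqrt{8\pi}},
            \qquad \omega_2 = i\,\omega_1, \\
  \eta_1 &= \frac{\pi}{\omega_1} = \frac{(2\pi)^{3/2}}{\Gamma(1/4)^2},
            \qquad \eta_2 = -i\,\eta_1
            \quad (\text{Legendre: } \omega_2 \eta_1 - \omega_1 \eta_2 = 2 i \pi), \\
  e^{2 i \pi \omega_1 / \omega_2} &= e^{2\pi}.
\end{align*}
% Hence the field Q(g_2, g_3, \omega_1/\pi, \eta_1/\pi, e^{2 i \pi \omega_1/\omega_2})
% is algebraic over Q(\pi, e^{\pi}, \Gamma(1/4)).
% A field of transcendence degree >= 3 that is algebraic over a field
% generated by three numbers forces those three numbers to be
% algebraically independent.
```

Exchanging the roles of $\omega_1$ and $\omega_2$ replaces $e^{2\pi}$ by $e^{-2\pi}$ and does not affect the argument.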


Further references on this topic include [97 Ber], [97 Ne], [97 W], [98 Gr], [98 La],
[98 P1], [98 Sar] and [98 W1]. In [96 Ne1], [96 Ne2], [97 Ne] and [98 P1], Yu.V. Nesterenko and
P. Philippon give sharp quantitative measures of algebraic independence for the numbers
in Corollary 4.3. Philippon’s works [98 P1] and [98 P3] also apply to Mahler’s method,
and more generally to a new class of functions which he calls K-functions, completing the
classes of E- and G-functions introduced by C.L. Siegel in 1929.


8 Barré-Sirieix, Katia; Diaz, Guy; Gramain, Francois; Philibert, Georges — Une preuve de la conjecture de 
Mahler-Manin. Invent. Math. 124 (1996), no. 1-3, 1-9. 
5 Diophantine Rings


P. Philippon has introduced a general concept of “Diophantine ring” in [92 P] (see also
[95 P]). This is a commutative Noetherian ring, with a valuation and a “size inequality”.
Special cases are $\mathbb{Z}$ and rings of integers of algebraic number fields on one hand, $\mathbb{F}_q[T]$ and
rings of functions of an algebraic curve over a finite field on the other hand. Also, if $A$ is a
Diophantine ring, then so is any ring which is finitely generated over $A$. In [92 P], Philippon
extends his criterion of algebraic independence [86 P] to Diophantine rings. In [97 P], he
proposes an axiomatic method for transcendence and algebraic independence. One main
axiom is a Schwarz Lemma. This axiomatic point of view enables him to cover most
transcendence and algebraic independence results involving functions of a single variable,
including Gel’fond’s method, Baker’s Theorem, the Lindemann-Weierstraß Theorem, Mahler’s
method as well as results related to the Carlitz exponential and Drinfel’d modules. Section 8
of [97 P] provides a very interesting speculative suggestion for solving Conjecture 1.2.